[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-07 Thread James Dyer (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045523#comment-13045523
]

James Dyer commented on SOLR-2571:
--

I added thresholdTokenFrequency to the SpellCheckComponent wiki page.

IndexBasedSpellChecker thresholdTokenFrequency fails with a
ClassCastException on startup
---

Key: SOLR-2571
URL: https://issues.apache.org/jira/browse/SOLR-2571
Project: Solr
Issue Type: Bug
Components: spellchecker
Affects Versions: 1.4.1, 3.1, 4.0
Reporter: James Dyer
Assignee: Robert Muir
Priority: Minor
Labels: whereIsHossManWhenYouNeedHim
Fix For: 3.3, 4.0

Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.patch,
SOLR-2571.patch, SOLR-2571.solr3.2.patch

When parsing the configuration for thresholdTokenFrequency, the
IndexBasedSpellChecker tries to pull a Float from the DataConfig.xml-derived
NamedList. However, this comes through as a String. Therefore, a
ClassCastException is always thrown whenever this parameter is specified.
The code ought to be doing Float.parseFloat(...) on the value.
This looks like a nice feature to use in cases where the data contains
misspelled or rare words, leading to queries being spuriously treated as
correct. I would have liked to have used this with a project we just
completed; however, this bug prevented that. This issue came up recently on
the User's mailing list, so I am raising an issue now.
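
A minimal sketch of the failure and the suggested fix, assuming the value arrives as a String. A plain Map stands in for the NamedList here, and the class name, method names, and the 0.0f default are illustrative assumptions, not Solr code:

```java
import java.util.HashMap;
import java.util.Map;

class ThresholdConfigDemo {

  // Mirrors the failing pattern: a direct cast of the configured value.
  // A Map is used as a stand-in for the NamedList (assumption).
  static float readWithCast(Map<String, Object> config) {
    // Throws ClassCastException when the stored value is a String.
    return (Float) config.get("thresholdTokenFrequency");
  }

  // Mirrors the suggested fix: parse the textual form of the value.
  static float readWithParse(Map<String, Object> config) {
    Object raw = config.get("thresholdTokenFrequency");
    // The 0.0f default for a missing parameter is an assumption.
    return raw == null ? 0.0f : Float.parseFloat(raw.toString());
  }

  public static void main(String[] args) {
    Map<String, Object> config = new HashMap<>();
    config.put("thresholdTokenFrequency", "0.07"); // value arrives as a String

    try {
      readWithCast(config);
    } catch (ClassCastException e) {
      System.out.println("cast failed: " + e.getClass().getSimpleName());
    }
    System.out.println("parsed: " + readWithParse(config));
  }
}
```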

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread James Dyer (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044917#comment-13044917
]

James Dyer commented on SOLR-2571:
--

{quote}
what makes this 'decision' of correctlySpelled? Do you know?
{quote}

I took a quick look to find out. It's more complicated than I thought! Here's
the basic gist (I think!):
- If the instance of SolrSpellChecker returns frequency data and all
suggestions have frequency 0, TRUE.
- If the instance of SolrSpellChecker returns frequency data and any
suggestion has frequency == 0, FALSE.
- If the instance of SolrSpellChecker returns NO frequency data but has
suggestions, OMIT.
- If the instance of SolrSpellChecker returns NO suggestions, FALSE.

Possibly this isn't fully accurate but I'm at least mostly correct here. Seems
like the discrepancy with DirectSolrSpellChecker is because it isn't returning
Frequency info?

This all happens in SpellCheckComponent.toNamedList() ... I'm guessing the code
here uses the presence or absence of frequency data as a kind of proxy
indicator of whether or not it's dealing with IndexBasedSpellChecker or
FileBasedSpellChecker. Possibly it would be better if each instance of
SolrSpellChecker had an isCorrectlySpelled() method that toNamedList() could
call? Maybe I should go open another jira issue for that?
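
To make the isCorrectlySpelled() idea concrete, here is a rough sketch of one possible shape for it. Every type and method below is hypothetical and only illustrates the design; none of it is existing Solr API, and the dictionary lookup is just a placeholder for whatever each spell checker would really do:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class SpellResponseSketch {
  // Stand-in for toNamedList(): asks the checker for a verdict instead of
  // inferring one from the presence or absence of frequency data.
  static Map<String, Object> toResponse(CorrectnessAware checker, String token) {
    Map<String, Object> response = new LinkedHashMap<>();
    checker.isCorrectlySpelled(token)
        .ifPresent(verdict -> response.put("correctlySpelled", verdict));
    return response;
  }

  public static void main(String[] args) {
    CorrectnessAware known = new DictionaryBackedChecker(Set.of("solr", "lucene"));
    System.out.println(toResponse(known, "solr"));
    System.out.println(toResponse(known, "sorl"));
    System.out.println(toResponse(new NoVerdictChecker(), "solr"));
  }
}

// Hypothetical interface: not part of Solr.
interface CorrectnessAware {
  // Optional.empty() means "this checker cannot tell", so the flag is omitted.
  Optional<Boolean> isCorrectlySpelled(String token);
}

// Illustrative checker that can answer from its own dictionary.
class DictionaryBackedChecker implements CorrectnessAware {
  private final Set<String> dictionary;

  DictionaryBackedChecker(Set<String> dictionary) {
    this.dictionary = dictionary;
  }

  @Override
  public Optional<Boolean> isCorrectlySpelled(String token) {
    return Optional.of(dictionary.contains(token));
  }
}

// Illustrative checker that has no basis for a verdict.
class NoVerdictChecker implements CorrectnessAware {
  @Override
  public Optional<Boolean> isCorrectlySpelled(String token) {
    return Optional.empty();
  }
}
```

The point of the Optional return is that the "OMIT" case becomes an explicit choice made by each checker rather than a side effect of missing frequency data.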

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045029#comment-13045029
]

Robert Muir commented on SOLR-2571:
---

{quote}
This version takes all of DirectSolrSpellChecker's parameters as Integer and
Float objects rather than Strings, as appropriate.
{quote}

Did you maybe upload an older patch? I took a look and it only seems to cut
over the threshold param.

{quote}
I'm not sure if this would have validated any unit tests (I didn't see any
tests that use DirectSolrSpellChecker).
{quote}

There is a test (DirectSolrSpellCheckerTest), but it's probably not that great :)

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-06 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13045031#comment-13045031
]

Robert Muir commented on SOLR-2571:
---

{quote}
Possibly this isn't fully accurate but I'm at least mostly correct here. Seems
like the discrepancy with DirectSolrSpellChecker is because it isn't returning
Frequency info?
{quote}

This sounds like a bug, care to open a separate issue on it? (we can resolve
the int/float stuff here on this one).

The thing certainly intends to return freq info...
{noformat}
SuggestWord[] suggestions = checker.suggestSimilar(
    new Term(field, token.toString()),
    options.count, options.reader, options.onlyMorePopular, accuracy);
for (SuggestWord suggestion : suggestions)
  result.add(token, suggestion.string, suggestion.freq);
{noformat}

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-03 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13043431#comment-13043431
]

Robert Muir commented on SOLR-2571:
---

Thanks for updating the patch!

{quote}
I found that DirectSolrSpellChecker returns results in a slightly different
format than IndexBasedSpellChecker. Is this OK? Can SOLRJ handle this or do we
need to tweak there?
{quote}

Not sure, I have used DirectSolrSpellChecker with solrj and I didn't have any
problems... but that's not saying there isn't one.

{quote}
Also, in one case IndexBasedSpellChecker returns correctlySpelled=false while
DirectSolrSpellChecker returns correctlySpelled=true. Is this discrepancy
valid?
{quote}

I don't know, what makes this 'decision' of correctlySpelled? Do you know?
Remember also the DirectSolrSpellChecker is a different spellchecker totally
than IndexBasedSpellChecker (it uses a fundamentally different algorithm),
although I tried to keep some of the parameters consistent.

Another question is, there are lots of other float/int arguments to
DirectSolrSpellChecker, maybe we should cut all of these over to int and
float while we are here?

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-02 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13042683#comment-13042683
]

Robert Muir commented on SOLR-2571:
---

Hi James, I'm confused about this one a little bit. Perhaps
DirectSolrSpellChecker is actually wrong?

If I configure the thing like this:

{noformat}
<float name="thresholdTokenFrequency">0.07</float>
{noformat}

Then it does apply the parameter. I guess what I'm asking is whether, in
general, we should be using int/float/etc. for these types and not str
(especially DirectSolrSpellChecker, which takes a lot of numeric parameters but
expects them all to be str). Just glancing through solrconfig.xml, it's not
clear that there is a precedent; it appears inconsistent as far as numeric
parameters go.
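
One way to sidestep the question is a reader that accepts either form. The helper below is only a sketch under that assumption: the class and method names are hypothetical, and it is not what any of the attached patches do. It takes a typed value (as an int or float element would supply) or its textual form (as a str element would supply):

```java
class NumericParams {

  // Accepts a typed Number or a String holding a float.
  static float toFloat(Object raw, float defaultValue) {
    if (raw == null) {
      return defaultValue;
    }
    if (raw instanceof Number) {
      return ((Number) raw).floatValue();
    }
    // Throws NumberFormatException on malformed text, which surfaces bad config early.
    return Float.parseFloat(raw.toString().trim());
  }

  // Accepts a typed Number or a String holding an int.
  static int toInt(Object raw, int defaultValue) {
    if (raw == null) {
      return defaultValue;
    }
    if (raw instanceof Number) {
      return ((Number) raw).intValue();
    }
    return Integer.parseInt(raw.toString().trim());
  }

  public static void main(String[] args) {
    System.out.println(toFloat("0.07", 0.0f));                // declared as str
    System.out.println(toFloat(Float.valueOf(0.07f), 0.0f));  // declared as float
    System.out.println(toInt("5", 1));                        // declared as str
    System.out.println(toInt(Integer.valueOf(5), 1));         // declared as int
  }
}
```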

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-02 Thread Mike Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13043165#comment-13043165
 ] 

Mike Sokolov commented on SOLR-2571:


sounds like a good case for a config schema

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-02 Thread Robert Muir (JIRA)

[
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13043168#comment-13043168
]

Robert Muir commented on SOLR-2571:
---

Mike, I think I agree: currently we are relying upon examples in the wiki, but
in this case one did not exist and it was/is totally confusing.

[jira] [Commented] (SOLR-2571) IndexBasedSpellChecker thresholdTokenFrequency fails with a ClassCastException on startup

2011-06-02 Thread Mike Sokolov (JIRA)


[ 
https://issues.apache.org/jira/browse/SOLR-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13043172#comment-13043172
 ] 

Mike Sokolov commented on SOLR-2571:


I posted a patch in SOLR-1758 that has a preliminary schema and implements 
schema-checking when loading config files that could help
