This is with Solr. The Lucene approach (assuming that is what is in my Java code, shared previously) works flawlessly, albeit with fewer options, AFAIK.
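
For reference, here is roughly what my Lucene-based approach boils down to. This is just a minimal sketch (assuming Lucene 5.x and a plain one-word-per-line list), not the exact code I shared earlier:

import java.io.FileReader;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.spell.PlainTextDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.FSDirectory;

public class LuceneSpellDemo {
    public static void main(String[] args) throws Exception {
        try (SpellChecker checker = new SpellChecker(FSDirectory.open(Paths.get("spellindex")))) {
            // Build the spelling index from a plain word list, one word per line.
            checker.indexDictionary(
                new PlainTextDictionary(new FileReader("/usr/share/dict/words")),
                new IndexWriterConfig(new StandardAnalyzer()), true);

            String word = "exampel";
            if (!checker.exist(word)) {               // only flag words not in the dictionary
                String[] suggestions = checker.suggestSimilar(word, 5);
                System.out.println(word + " -> " + String.join(", ", suggestions));
            }
        }
    }
}

The exist() check is what lets me flag only genuinely unknown words, which is exactly the part I haven't been able to reproduce with Solr.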

I'm not sure what you mean by "business case"... I want to spell-check user-supplied text in my Java app. The end-user then activates the spell-checker on the entire text (presumably a few paragraphs or less). I can use StyledText's capabilities to highlight the misspelled words, and when the user clicks a highlighted word, a menu will appear where he can select a suggested spelling.
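
On the display side, I'm picturing something along these lines for the highlighting (just a sketch of the StyledText part; the click-to-suggest menu isn't shown, and SWT.UNDERLINE_ERROR gives the squiggly underline):

import org.eclipse.swt.SWT;
import org.eclipse.swt.custom.StyleRange;
import org.eclipse.swt.custom.StyledText;

final class SpellHighlighter {
    // Underline one misspelled word; offset/length are character positions
    // within the widget's text.
    static void markMisspelled(StyledText text, int offset, int length) {
        StyleRange range = new StyleRange();
        range.start = offset;
        range.length = length;
        range.underline = true;
        range.underlineStyle = SWT.UNDERLINE_ERROR;  // squiggly underline
        range.underlineColor = text.getDisplay().getSystemColor(SWT.COLOR_RED);
        text.setStyleRange(range);
    }
}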

But so far, I've had trouble:

 * determining which words are misspelled (because Solr often returns
   suggestions for correctly spelled words); see the SolrJ sketch after
   this list.
 * getting coherent suggestions (regardless of whether the query word is
   misspelled or not).
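
For the first point, here is roughly how I'm querying Solr and reading the results back from Java. This is a SolrJ sketch under a few assumptions: the core name and the "/spell" handler are placeholders for my local setup, and spellcheck.extendedResults is what populates the correctlySpelled flag:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.SpellCheckResponse;

public class SpellCheckProbe {
    public static void main(String[] args) throws Exception {
        // Core name "spell" and handler "/spell" are placeholders.
        try (HttpSolrClient solr = new HttpSolrClient("http://localhost:8983/solr/spell")) {
            SolrQuery q = new SolrQuery("*:*");   // dummy main query; only the spellcheck section matters
            q.setRequestHandler("/spell");
            q.set("spellcheck", "true");
            q.set("spellcheck.q", "NOTE: This is purely as an example.");
            q.set("spellcheck.extendedResults", "true"); // adds frequencies and correctlySpelled

            QueryResponse rsp = solr.query(q);
            SpellCheckResponse spell = rsp.getSpellCheckResponse();
            if (spell == null) {
                System.out.println("No spellcheck section in the response.");
                return;
            }
            System.out.println("correctlySpelled: " + spell.isCorrectlySpelled());
            for (SpellCheckResponse.Suggestion s : spell.getSuggestions()) {
                System.out.println(s.getToken() + " -> " + s.getAlternatives());
            }
        }
    }
}

What I expect is that correctly spelled tokens like "NOTE" simply wouldn't show up in getSuggestions(); instead they come back with those "n ot e" splits.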

It's been a bit puzzling (and frustrating)! It only took me 10 minutes to get the Lucene spell checker working, but I agree that Solr would be the better way to go, if I can ever get it configured properly...

Mark


On 10/1/2015 12:50 PM, Alexandre Rafalovitch wrote:
Is that with Lucene or with Solr? Because Solr has several different
spell-checker modules you can configure.  I would recommend trying
them first.

And, frankly, I still don't know what your business case is.

Regards,
    Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 1 October 2015 at 12:38, Mark Fenbers <mark.fenb...@noaa.gov> wrote:
Yes, and I've spent numerous hours configuring and reconfiguring, and
eventually even starting over, but I still haven't gotten it to work right.
Even now, I'm getting bizarre results.  For example, I query "NOTE: This
is purely as an example." and I get back really bizarre suggestions, like
"n ot e" and "n o te" and "n o t e" for the first word, which isn't even
misspelled!  The same goes for "purely" and "example"!  Moreover, I get
extended results showing frequencies of over 2600 occurrences for these
suggestions, when I'm not even using an indexed spell checker.  I'm only
using a file-based spell checker (/usr/share/dict/words) and the
wordbreak checker.

At this point, I can't even figure out how to narrow down my confusion so
that I can post concise questions to the group.  But I'll get there
eventually, starting with removing the wordbreak checker for the time being.
Your response was encouraging, at least.
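
One thing I can try without touching solrconfig.xml is to pick the dictionaries per request, since spellcheck.dictionary can be given more than once. A rough sketch (the names "file" and "wordbreak" are just my guesses at what the checkers are called in my solrconfig.xml, and "/spell" is an assumed handler):

import org.apache.solr.client.solrj.SolrQuery;

final class SpellQueryFactory {
    // "file" and "wordbreak" must match the spellchecker names declared in
    // solrconfig.xml; "/spell" is an assumed request handler.
    static SolrQuery fileCheckerOnly(String textToCheck) {
        SolrQuery q = new SolrQuery("*:*");            // dummy main query
        q.setRequestHandler("/spell");
        q.set("spellcheck", "true");
        q.set("spellcheck.q", textToCheck);
        q.set("spellcheck.dictionary", "file");        // just the file-based checker
        // q.add("spellcheck.dictionary", "wordbreak"); // add back once "file" behaves
        return q;
    }
}

If the "n ot e" style suggestions disappear with only the file dictionary active, that would at least confirm the wordbreak checker is the one producing them.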

Mark



On 10/1/2015 9:45 AM, Alexandre Rafalovitch wrote:
Hi Mark,

Have you gone through a Solr tutorial yet? If/when you do, you will
see that you don't need to code any of this. It is configured as part of
the web-facing offering, which is tweaked via XML configuration
files (or REST API calls). And most of the standard pipelines are
already pre-configured, so you don't need to invent them from scratch.

On your specific question, it would be better to ask what _business_
level functionality you are trying to achieve and see if Solr can help
with that. Starting from Lucene code is less useful :-)

Regards,
     Alex.
----
Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
http://www.solr-start.com/


On 1 October 2015 at 07:48, Mark Fenbers <mark.fenb...@noaa.gov> wrote:
