[
https://issues.apache.org/jira/browse/SOLR-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470295
]
Otis Gospodnetic commented on SOLR-81:
--------------------------------------
Adam,
I took a look at your patch. It looks like you copied over various
n-gram tokenizer classes and their unit tests that I put in Lucene's
contrib/analyzers/.... Did you do this on purpose? I intentionally put
those n-gram tokenizers under Lucene's contrib, as they are generic and not
Solr-specific. Thus, the only classes my patch has are classes that are
Solr-specific:
src/java/org/apache/solr/analysis/EdgeNGramTokenizerFactory.java
src/java/org/apache/solr/analysis/NGramTokenizerFactory.java
src/java/org/apache/solr/analysis/BaseTokenizerFactory.java
And instead of copying the source classes from Lucene's contrib/analyzers/....
it adds the new jar built from those sources:
lib/lucene-analyzers-2.1-dev.jar
Plus:
lib/lucene-spellchecker-2.1-dev.jar
example/solr/conf/schema.xml
I have some locally modified code for this issue that was not part of the
first patch. I was going to attach an updated patch, on the assumption that you
didn't really want those few generic tokenizer classes copied from Lucene over
to Solr. But since changes now live in two places, so to speak, let's do the
following to unify our work:
Could you please:
- open a new LUCENE issue, or just reopen the one where I originally attached
this code, and post your patch to the Lucene tokenizers there.
- prepare a new patch for this issue and make sure it contains only the
Solr-specific classes (see above), plus those two jars.
I'll upload my patch for schema.xml, so you can see my config (your patch
didn't have this), and make sure your changes to the code are in sync with that.
Finally, are you making use of this code somehow already?
One thing that is completely missing from this patch is the RequestHandler that
knows how to take the input (a query string), and get suggestions for
alternative spellings via a SpellChecker instance. I have some
NGramRequestHandler code locally, but the code is unfinished.
> Add Query Spellchecker functionality
> ------------------------------------
>
> Key: SOLR-81
> URL: https://issues.apache.org/jira/browse/SOLR-81
> Project: Solr
> Issue Type: New Feature
> Components: search
> Reporter: Otis Gospodnetic
> Priority: Minor
> Attachments: SOLR-81-edgengram-ngram.patch, SOLR-81-ngram.patch
>
>
> Use the simple approach of n-gramming outside of Solr and indexing n-gram
> documents. For example:
> <doc>
> <field name="word">lettuce</field>
> <field name="start3">let</field>
> <field name="gram3">let ett ttu tuc uce</field>
> <field name="end3">uce</field>
> <field name="start4">lett</field>
> <field name="gram4">lett ettu ttuc tuce</field>
> <field name="end4">tuce</field>
> </doc>
> See:
> http://www.mail-archive.com/[email protected]/msg01254.html
> Java clients: SOLR-20 (add delete commit optimize), SOLR-30 (search)
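
For reference, the n-gramming outlined in the quoted description can be
sketched in a few lines of plain Java. This is only an illustration under
stated assumptions, not code from either patch: the class and method names
(NGramFields, grams, fields) are made up here, it does not use the Lucene
tokenizers, and it does not escape XML special characters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr or Lucene.
final class NGramFields {

    /** Returns every contiguous n-character substring of word, in order. */
    static List<String> grams(String word, int n) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            out.add(word.substring(i, i + n));
        }
        return out;
    }

    /**
     * Builds field name -> value pairs for gram sizes minN..maxN, following
     * the startN / gramN / endN layout from the issue description.
     */
    static Map<String, String> fields(String word, int minN, int maxN) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("word", word);
        for (int n = minN; n <= maxN; n++) {
            List<String> g = grams(word, n);
            if (g.isEmpty()) {
                continue; // word is shorter than n, so no fields for this size
            }
            doc.put("start" + n, g.get(0));
            doc.put("gram" + n, String.join(" ", g));
            doc.put("end" + n, g.get(g.size() - 1));
        }
        return doc;
    }

    public static void main(String[] args) {
        // Prints one <field> line per entry (values are not XML-escaped).
        fields("lettuce", 3, 4).forEach((name, value) ->
            System.out.println("<field name=\"" + name + "\">" + value + "</field>"));
    }
}
```

Calling fields("lettuce", 3, 4) yields the same seven fields as the example
document in the description; words shorter than a given gram size simply get
no fields for that size.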