Re: Collations are not working fine.

2015-02-26 Thread Rajesh Hazari
Below is the field definition that we used. It's just a basic definition:

<analyzer type="index">
  <tokenizer class="solr.ClassicTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
</analyzer>
<analyzer type="query">
  <tokenizer class="solr.ClassicTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
</analyzer>




*Rajesh.*


On Thu, Feb 26, 2015 at 2:03 AM, Nitin Solanki nitinml...@gmail.com wrote:

 Hi Rajesh,
 What configuration did you set in your schema.xml?


RE: Collations are not working fine.

2015-02-25 Thread Reitzel, Charles
Hi Rajesh,

That was very helpful.   Based on your experience, I dug deeper into it and 
figured out that it does attempt to return collations for single term queries 
in my configuration as well.   However, in the test cases I have been using, 
the suggested correction never gets any hits.   Again, this is based on our use 
cases that always have at least one filter query present.   As soon as I 
dropped the filter query, sure enough, collations were returned for the single 
term.

But this still doesn't solve my original problem:  The original term is never 
included in the collation results (or validated with a query like the suggested 
corrections).   Thus, if it is a valid term, we don't want to throw it away.   
It would be great to have the collator validate it as a term (perhaps 
conditionally, based on the  exactMatchFirst component dictionary parameter).   
But, at this point, I'm happy to just consult the origFreq value in the 
extended results.

Thanks,
Charlie

-Original Message-
From: Rajesh Hazari [mailto:rajeshhaz...@gmail.com] 
Sent: Monday, February 23, 2015 11:14 AM
To: solr-user@lucene.apache.org
Subject: Re: Collations are not working fine.

Hi,

We use the spellcheck component with the configs below to get the best collation
(exact collation) when a query has either a single term or multiple terms.

As Charles mentioned above, we check getOriginalFrequency() for each term in
our service before we send the spellcheck response to the client. This may
not be the case for you. Hope this helps.

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- default values for query parameters can be specified, these
       will be overridden by parameters in the request -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">100</int>
    <str name="df">textSpell</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <int name="spellcheck.count">5</int>
    <str name="spellcheck.alternativeTermCount">15</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.maxCollations">100</str>
    <str name="spellcheck.collateParam.mm">100%</str>
    <str name="spellcheck.collateParam.q.op">AND</str>
    <str name="spellcheck.maxCollationTries">1000</str>
    <str name="q.op">OR</str>
    ...
  </lst>
</requestHandler>
...

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">textSpell</str>
    <str name="combineWords">true</str>
    <str name="breakWords">false</str>
    <int name="maxChanges">5</int>
  </lst>

  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">textSpell</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <!-- <str name="classname">solr.DirectSolrSpellChecker</str> -->
    <str name="spellcheckIndexDir">./spellchecker</str>
    <!-- <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str> -->
    <str name="accuracy">0.75</str>
    <float name="thresholdTokenFrequency">0.01</float>
    <str name="buildOnCommit">true</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
  </lst>

</searchComponent>



*Rajesh**.*


Re: Collations are not working fine.

2015-02-25 Thread Nitin Solanki
Hi Rajesh,
What configuration did you set in your schema.xml?

On Sat, Feb 14, 2015 at 2:18 AM, Rajesh Hazari rajeshhaz...@gmail.com
wrote:

 Hi Nitin,

 Can you try the config below? It seems to be working
 for us.

 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

   <str name="queryAnalyzerFieldType">text_general</str>

   <lst name="spellchecker">
     <str name="name">wordbreak</str>
     <str name="classname">solr.WordBreakSolrSpellChecker</str>
     <str name="field">textSpell</str>
     <str name="combineWords">true</str>
     <str name="breakWords">false</str>
     <int name="maxChanges">5</int>
   </lst>

   <lst name="spellchecker">
     <str name="name">default</str>
     <str name="field">textSpell</str>
     <str name="classname">solr.IndexBasedSpellChecker</str>
     <str name="spellcheckIndexDir">./spellchecker</str>
     <str name="accuracy">0.75</str>
     <float name="thresholdTokenFrequency">0.01</float>
     <str name="buildOnCommit">true</str>
     <str name="spellcheck.maxResultsForSuggest">5</str>
   </lst>

 </searchComponent>

 <str name="spellcheck">true</str>
 <str name="spellcheck.dictionary">default</str>
 <str name="spellcheck.dictionary">wordbreak</str>
 <int name="spellcheck.count">5</int>
 <str name="spellcheck.alternativeTermCount">15</str>
 <str name="spellcheck.collate">true</str>
 <str name="spellcheck.onlyMorePopular">false</str>
 <str name="spellcheck.extendedResults">true</str>
 <str name="spellcheck.maxCollations">100</str>
 <str name="spellcheck.collateParam.mm">100%</str>
 <str name="spellcheck.collateParam.q.op">AND</str>
 <str name="spellcheck.maxCollationTries">1000</str>


 *Rajesh.*

 On Fri, Feb 13, 2015 at 1:01 PM, Dyer, James james.d...@ingramcontent.com
 
 wrote:

  Nitin,
 
  Can you post the full spellcheck response when you query:
 
  q=gram_ci:gone wthh thes wint&wt=json&indent=true&shards.qt=/spell
 
  James Dyer
  Ingram Content Group
 
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Friday, February 13, 2015 1:05 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi James Dyer,
  I did as you told me and used WordBreakSolrSpellChecker instead of
  shingles, but collations are still not being returned.
  For instance, I tried to get the collation "gone with the wind" by
  searching "gone wthh thes wint" on field=gram_ci, but didn't succeed, even
  though I am getting the suggestions wthh as *with*, thes as *the*, and wint
  as *wind*.
  Also, "gone with the wind" occurs 167 times in my documents. I don't know
  whether I am missing something.
  Please check my Solr configuration below:

  *URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:gone wthh thes
  wint&wt=json&indent=true&shards.qt=/spell
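A request like the one above contains spaces, a colon and several parameters, so it is easy to get the escaping wrong when pasting it into a browser or a mail. As a minimal sketch (not part of the original configuration), the same request could be built in Python so that the query string is encoded correctly. The helper name `build_spell_url` and the `http://` prefix on the base URL are assumptions made for illustration; the parameters are the ones quoted in the thread.

```python
from urllib.parse import urlencode


def build_spell_url(base, query, handler="/spell"):
    """Build a spellcheck request URL with properly encoded parameters.

    `base` is the core URL (hypothetical here), `query` the raw q value.
    """
    params = {
        "q": query,
        "wt": "json",
        "indent": "true",
        "shards.qt": handler,
    }
    # urlencode escapes spaces, ':' and '/' and joins the pairs with '&'.
    return base + handler + "?" + urlencode(params)


url = build_spell_url(
    "http://localhost:8983/solr/wikingram",
    "gram_ci:gone wthh thes wint",
)
print(url)
```

Printing the URL before sending it makes it easy to confirm that every parameter is separated by `&` and that the misspelled terms arrive as a single `q` value.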
 
  *solrconfig.xml:*
 
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">textSpellCi</str>
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">gram_ci</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
      <float name="accuracy">0.5</float>
      <int name="maxEdits">2</int>
      <int name="minPrefix">0</int>
      <int name="maxInspections">5</int>
      <int name="minQueryLength">2</int>
      <float name="maxQueryFrequency">0.9</float>
      <str name="comparatorClass">freq</str>
    </lst>
    <lst name="spellchecker">
      <str name="name">wordbreak</str>
      <str name="classname">solr.WordBreakSolrSpellChecker</str>
      <str name="field">gram</str>
      <str name="combineWords">true</str>
      <str name="breakWords">true</str>
      <int name="maxChanges">5</int>
    </lst>
  </searchComponent>

  <requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
    <lst name="defaults">
      <str name="df">gram_ci</str>
      <str name="spellcheck.dictionary">default</str>
      <str name="spellcheck">on</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.count">25</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.maxResultsForSuggest">1</str>
      <str name="spellcheck.alternativeTermCount">25</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.maxCollations">50</str>
      <str name="spellcheck.maxCollationTries">50</str>
      <str name="spellcheck.collateExtendedResults">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>
 
  *Schema.xml: *
 
  <field name="gram_ci" type="textSpellCi" indexed="true" stored="true"
         multiValued="false"/>

  </fieldType>
  <fieldType name="textSpellCi" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
 



Re: Collations are not working fine.

2015-02-23 Thread Nitin Solanki
Hi Charles,
 How did you patch the Suggester to get frequency information in
the spellcheck response?
That is very useful. I would like to do the same.


On Mon, Feb 16, 2015 at 7:59 PM, Reitzel, Charles 
charles.reit...@tiaa-cref.org wrote:

 I have been working with collations the last couple days and I kept adding
 the collation-related parameters until it started working for me.   It
 seems I needed <str name="spellcheck.collateMaxCollectDocs">50</str>.

 But, I am using the Suggester with the WFSTLookupFactory.

 Also, I needed to patch the suggester to get frequency information in the
 spellcheck response.


Re: Collations are not working fine.

2015-02-23 Thread Rajesh Hazari
 Suggestions (spellcheck)</str>
      <str name="echoParams">explicit</str>
      <str name="wt">json</str>
      <str name="rows">0</str>
      <str name="defType">edismax</str>
      <str name="df">text_all</str>
      <str name="fl">id,name,ticker,entityType,transactionType,accountType</str>
      <str name="spellcheck">true</str>
      <str name="spellcheck.count">5</str>
      <str name="spellcheck.dictionary">suggestDictionary</str>
      <str name="spellcheck.alternativeTermCount">5</str>
      <str name="spellcheck.collate">true</str>
      <str name="spellcheck.extendedResults">true</str>
      <str name="spellcheck.maxCollationTries">10</str>
      <str name="spellcheck.maxCollations">5</str>
    </lst>
    <arr name="last-components">
      <str>suggestSC</str>
    </arr>
  </requestHandler>
 
  -Original Message-
  From: Nitin Solanki [mailto:nitinml...@gmail.com]
  Sent: Tuesday, February 17, 2015 3:17 AM
  To: solr-user@lucene.apache.org
  Subject: Re: Collations are not working fine.
 
  Hi Charles,
  Will you please send the configuration that you tried?
  It will help to solve my problem. Have you sorted the collations on hits
  or frequencies of suggestions? If you did, then please assist me.
 

RE: Collations are not working fine.

2015-02-23 Thread Reitzel, Charles
I filed issue SOLR-7144 with the patch attached.   It's probably best to get 
some feedback from developers.  It may not be the right approach, etc.

Also, spellcheck.maxCollationTries > 0 is the parameter needed to get collation 
results that respect the current filter queries, etc.

Set spellcheck.maxCollations > 1 to get multiple collation results.   However, 
if the original query has only a single term, there will be no collation 
results.   Thus, for single term queries, you need to look at the original 
frequency information to determine if the original term is valid or not.   
There may be spellcheck suggestions even for terms with origFreq > 0.
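The origFreq check described above can be sketched on the client side. This is only an illustration under stated assumptions, not the patch attached to SOLR-7144: it assumes the JSON response (wt=json with spellcheck.extendedResults=true) lists suggestions as a flat sequence of term and info entries, and the function names and sample frequencies are invented for the example.

```python
def suggestions_by_term(flat):
    """Pair up the flat [term, info, term, info, ...] list into a dict.

    Entries whose value is not a suggestion block (for example a
    "correctlySpelled" flag) are skipped.
    """
    pairs = zip(flat[0::2], flat[1::2])
    return {term: info for term, info in pairs
            if isinstance(info, dict) and "suggestion" in info}


def keep_original_term(info):
    """True when the original term itself occurs in the index."""
    return info.get("origFreq", 0) > 0


# Hypothetical response fragment; the frequencies are invented.
sample = [
    "brown", {"numFound": 1, "startOffset": 0, "endOffset": 5, "origFreq": 12,
              "suggestion": [{"word": "brownstone", "freq": 3}]},
    "correctlySpelled", False,
]

for term, info in suggestions_by_term(sample).items():
    action = "keep" if keep_original_term(info) else "replace"
    print(term, action)
```

With this kind of check a valid single term such as brown is kept even though a longer suggestion exists, while a term with origFreq of 0 would be replaced by its suggestion.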

-Original Message-
From: Nitin Solanki [mailto:nitinml...@gmail.com] 
Sent: Monday, February 23, 2015 11:35 AM
To: solr-user@lucene.apache.org
Subject: Re: Collations are not working fine.

Hi Charles,
 How did you patch the Suggester to get frequency information in the 
spellcheck response?
That is very useful. I would like to do the same.



Re: Collations are not working fine.

2015-02-20 Thread Nitin Solanki
How can I get only the best collations, i.e. those with the most hits, and
sort them by hit count?
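One possible client-side approach, sketched here under assumptions, is to request spellcheck.collateExtendedResults=true and sort the returned collations by their hit counts. The sketch assumes each collation has already been extracted from the response as a dict with collationQuery and hits keys; the function name `best_collations` and the sample hit counts are hypothetical.

```python
def best_collations(collations, limit=5):
    """Return the collation queries with the most hits, best first."""
    ranked = sorted(collations, key=lambda c: c["hits"], reverse=True)
    return [c["collationQuery"] for c in ranked[:limit]]


# Hypothetical collations; the hit counts are invented for illustration.
sample = [
    {"collationQuery": "brownstone fox", "hits": 3},
    {"collationQuery": "brown fox", "hits": 40},
    {"collationQuery": "brown fog", "hits": 12},
]

print(best_collations(sample, limit=2))
```

Sorting on the client keeps the Solr configuration unchanged, at the cost of asking for more collations than are finally shown.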

On Wed, Feb 18, 2015 at 3:53 AM, Reitzel, Charles 
charles.reit...@tiaa-cref.org wrote:

 Hi Nitin,

 I was trying many different options for a couple different queries.   In
 fact, I have collations working ok now with the Suggester and WFSTLookup.
  The problem may have been due to a different dictionary and/or lookup
 implementation and the specific options I was sending.

 In general, we're using spellcheck for search suggestions.   The Suggester
 component (vs. Suggester spellcheck implementation), doesn't handle all of
 our cases.  But we can get things working using the spellcheck interface.
 What gives us particular troubles are the cases where a term may be valid
 by itself, but also be the start of longer words.

 The specific terms are acronyms specific to our business.   But I'll
 attempt to show generic examples.

 E.g. a partial term like fo can expand to fox, fog, etc. and a full term
 like brown can also expand to something like brownstone.   And, yes, the
 collation brownstone fox is nonsense.  But assume, for the sake of
 argument, it appears in our documents somewhere.

 For multiple term query with a spelling error (or partially typed term):
 brown fo

 We get collations in order of hits, descending like ...
 brown fox,
 brown fog,
 brownstone fox.

 So far, so good.

 For a single term query, brown, we get a single suggestion, brownstone and
 no collations.

 So, we don't know to keep the term brown!

 At this point, we need spellcheck.extendedResults=true and look at the
 origFreq value in the suggested corrections.  Unfortunately, the Suggester
 (spellcheck dictionary) does not populate the original frequency
 information.  And, without this information, the SpellCheckComponent cannot
 format the extended results.

 However, with a simple change to Suggester.java, it was easy to get the
 needed frequency information and use it to make a sound decision to keep or
 drop the input term.   But I'd be much obliged if there is a better way to
 go about it.

 Configs below.

 Thanks,
 Charlie

 <!-- SpellCheck component -->
 <searchComponent class="solr.SpellCheckComponent" name="suggestSC">
   <lst name="spellchecker">
     <str name="name">suggestDictionary</str>
     <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
     <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
     <str name="field">text_all</str>
     <float name="threshold">0.0001</float>
     <str name="exactMatchFirst">true</str>
     <str name="buildOnCommit">true</str>
   </lst>
 </searchComponent>

 <!-- Request Handler -->
 <requestHandler name="/tcSuggest" class="solr.SearchHandler">
   <lst name="defaults">
     <str name="title">Search Suggestions (spellcheck)</str>
     <str name="echoParams">explicit</str>
     <str name="wt">json</str>
     <str name="rows">0</str>
     <str name="defType">edismax</str>
     <str name="df">text_all</str>
     <str name="fl">id,name,ticker,entityType,transactionType,accountType</str>
     <str name="spellcheck">true</str>
     <str name="spellcheck.count">5</str>
     <str name="spellcheck.dictionary">suggestDictionary</str>
     <str name="spellcheck.alternativeTermCount">5</str>
     <str name="spellcheck.collate">true</str>
     <str name="spellcheck.extendedResults">true</str>
     <str name="spellcheck.maxCollationTries">10</str>
     <str name="spellcheck.maxCollations">5</str>
   </lst>
   <arr name="last-components">
     <str>suggestSC</str>
   </arr>
 </requestHandler>

Re: Collations are not working fine.

2015-02-17 Thread Nitin Solanki
Hey James Dyer,
Sorry for the late reply; I was away for a couple of days. I tried the
configuration that Rajesh Hazari pasted in his mail, and it seems to be
working. I think it works because spellcheck.count was reduced from
<str name="spellcheck.count">25</str> to <str name="spellcheck.count">5</str>:
fewer candidate collations are generated, so spellcheck.maxCollationTries is
able to reach and evaluate the collation "gone with the wind".
The problem now is that the collation "gone with the wind" reports only 53
hits (see collations.png), whereas querying the correct phrase directly with
q=gone with the wind returns numFound=394 (see response.png).
Any idea why?





Re: Collations are not working fine.

2015-02-17 Thread Nitin Solanki
Hey Rajesh,
Sorry for the late reply; I was away for a couple of days. I tried the
configuration you sent me. Thanks a lot, it seems to be working. I think it
works because spellcheck.count was reduced from
<str name="spellcheck.count">25</str> to <str name="spellcheck.count">5</str>:
fewer candidate collations are generated, so spellcheck.maxCollationTries is
able to reach and evaluate the collation "gone with the wind".
The problem now is that the collation "gone with the wind" reports only 53
hits (see collations.png), whereas querying the correct phrase directly with
q=gone with the wind returns numFound=394 (see response.png).
Any idea why?

One more thing: you used
<str name="spellcheck.collateParam.mm">100%</str>
<str name="spellcheck.collateParam.q.op">AND</str>
but they do not seem to have any effect. Removing these two lines did not
change the results, and neither did setting spellcheck.collateParam.mm to 0%
and spellcheck.collateParam.q.op to OR. Even after googling, I could not work
out what spellcheck.collateParam.mm and spellcheck.collateParam.q.op do.
Could you please explain?
Thanks.
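
For readers following the thread: the Solr documentation describes spellcheck.collateParam.XX as overriding the request parameter XX only for the internal queries that test candidate collations, so these settings would be expected to change which collations survive (and their reported hits), not the main result set. Note also that mm is read by the dismax/edismax parsers, so with the default lucene parser it would probably have no visible effect. The sketch below only models that override rule; the function name is made up for illustration and this is not Solr's actual code.

```python
PREFIX = "spellcheck.collateParam."

def collation_test_params(request_params):
    """Model the parameters used when a candidate collation is tested.

    Illustration only: start from the normal request parameters, then let
    every spellcheck.collateParam.XX value replace parameter XX.
    """
    # Keep everything except the override entries themselves.
    params = {k: v for k, v in request_params.items() if not k.startswith(PREFIX)}
    # Apply each override under its un-prefixed name.
    for key, value in request_params.items():
        if key.startswith(PREFIX):
            params[key[len(PREFIX):]] = value
    return params

request = {
    "q": "gone wthh thes wint",
    "mm": "0%",
    "spellcheck.collateParam.mm": "100%",
    "spellcheck.collateParam.q.op": "AND",
}
print(collation_test_params(request))
```

Under this model, the main query keeps mm=0% while each collation test query runs with mm=100% and q.op=AND.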




Re: Collations are not working fine.

2015-02-17 Thread Nitin Solanki
Hi Charles,
Could you please send the configuration you tried? It would help me solve
my problem. Have you sorted the collations by hits or by the frequencies of
the suggestions? If you did, then please let me know how.


RE: Collations are not working fine.

2015-02-17 Thread Reitzel, Charles
Hi Nitin,

I was trying many different options for a couple of different queries.  In
fact, I have collations working OK now with the Suggester and WFSTLookup.  The
problem may have been due to a different dictionary and/or lookup
implementation and the specific options I was sending.

In general, we're using spellcheck for search suggestions.  The Suggester
component (as opposed to the Suggester spellcheck implementation) doesn't
handle all of our cases, but we can get things working using the spellcheck
interface.  What gives us particular trouble are the cases where a term is
valid by itself but is also the start of longer words.

The specific terms are acronyms specific to our business, but I'll attempt to
show generic examples.

E.g., a partial term like "fo" can expand to "fox", "fog", etc., and a full
term like "brown" can also expand to something like "brownstone".  And, yes,
the collation "brownstone fox" is nonsense.  But assume, for the sake of
argument, that it appears in our documents somewhere.

For a multiple-term query with a spelling error (or a partially typed term),
e.g. "brown fo", we get collations in descending order of hits:
brown fox,
brown fog,
brownstone fox.

So far, so good.
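
The ordering by hits described here could also be reproduced on the client side. This is a minimal sketch, assuming the collations have already been extracted from the spellcheck response into a list of dicts with collationQuery and hits keys (the shape returned when spellcheck.collateExtendedResults=true). The function name and the hit counts are made up for illustration.

```python
def sort_collations_by_hits(collations):
    """Return the collations ordered by hit count, highest first."""
    return sorted(collations, key=lambda c: c["hits"], reverse=True)

# Illustrative values only; real hit counts come from the Solr response.
collations = [
    {"collationQuery": "brownstone fox", "hits": 2},
    {"collationQuery": "brown fox", "hits": 40},
    {"collationQuery": "brown fog", "hits": 7},
]

for c in sort_collations_by_hits(collations):
    print(c["collationQuery"], c["hits"])
```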

For a single-term query, "brown", we get a single suggestion, "brownstone",
and no collations.

So we cannot tell whether to keep the term "brown"!

At this point, we need to set spellcheck.extendedResults=true and look at the
origFreq value in the suggested corrections.  Unfortunately, the Suggester
(spellcheck dictionary) does not populate the original frequency information,
and without it the SpellCheckComponent cannot format the extended results.

However, with a simple change to Suggester.java, it was easy to get the needed
frequency information and use it to make a sound decision to keep or drop the
input term.  But I'd be much obliged if there is a better way to go about it.
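
The keep-or-drop decision could look roughly like the sketch below. It assumes the per-term entry from an extended spellcheck response has already been parsed into a dict with origFreq and a suggestion list of word/freq pairs, and that origFreq is actually populated (which, for the Suggester, depends on the patch mentioned above). The function name, the min_orig_freq threshold and the sample frequencies are hypothetical.

```python
def candidate_terms(original, entry, min_orig_freq=1):
    """List the terms to offer for one input term.

    The input term is kept (and listed first) only when it occurs in the
    index at least `min_orig_freq` times; the suggestions follow it.
    `min_orig_freq` is a hypothetical client-side threshold, not a Solr setting.
    """
    terms = [s["word"] for s in entry.get("suggestion", [])]
    if entry.get("origFreq", 0) >= min_orig_freq:
        terms.insert(0, original)
    return terms

# Illustrative entry; the frequencies are made up.
entry = {"numFound": 1, "origFreq": 12,
         "suggestion": [{"word": "brownstone", "freq": 3}]}
print(candidate_terms("brown", entry))
```

With a non-zero origFreq, "brown" is kept ahead of "brownstone"; with origFreq missing or zero, only the suggestion is offered.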

Configs below.

Thanks,
Charlie

<!-- SpellCheck component -->
<searchComponent class="solr.SpellCheckComponent" name="suggestSC">
  <lst name="spellchecker">
    <str name="name">suggestDictionary</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
    <str name="field">text_all</str>
    <float name="threshold">0.0001</float>
    <str name="exactMatchFirst">true</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<!-- Request Handler -->
<requestHandler name="/tcSuggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="title">Search Suggestions (spellcheck)</str>
    <str name="echoParams">explicit</str>
    <str name="wt">json</str>
    <str name="rows">0</str>
    <str name="defType">edismax</str>
    <str name="df">text_all</str>
    <str name="fl">id,name,ticker,entityType,transactionType,accountType</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.dictionary">suggestDictionary</str>
    <str name="spellcheck.alternativeTermCount">5</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.maxCollationTries">10</str>
    <str name="spellcheck.maxCollations">5</str>
  </lst>
  <arr name="last-components">
    <str>suggestSC</str>
  </arr>
</requestHandler>


RE: Collations are not working fine.

2015-02-16 Thread Reitzel, Charles
I have been working with collations for the last couple of days, and I kept
adding the collation-related parameters until it started working for me.  It
seems I needed <str name="spellcheck.collateMaxCollectDocs">50</str>.

But I am using the Suggester with the WFSTLookupFactory.

Also, I needed to patch the Suggester to get frequency information in the
spellcheck response.


RE: Collations are not working fine.

2015-02-13 Thread Dyer, James
Nitin,

Can you post the full spellcheck response when you query:

q=gram_ci:gone wthh thes wint&wt=json&indent=true&shards.qt=/spell

James Dyer
Ingram Content Group




Re: Collations are not working fine.

2015-02-13 Thread Rajesh Hazari
Hi Nitin,

Can you try the config below? It seems to be working for us.

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">text_general</str>

  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">textSpell</str>
    <str name="combineWords">true</str>
    <str name="breakWords">false</str>
    <int name="maxChanges">5</int>
  </lst>

  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">textSpell</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="accuracy">0.75</str>
    <float name="thresholdTokenFrequency">0.01</float>
    <str name="buildOnCommit">true</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
  </lst>

</searchComponent>

<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.dictionary">wordbreak</str>
<int name="spellcheck.count">5</int>
<str name="spellcheck.alternativeTermCount">15</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.maxCollations">100</str>
<str name="spellcheck.collateParam.mm">100%</str>
<str name="spellcheck.collateParam.q.op">AND</str>
<str name="spellcheck.maxCollationTries">1000</str>


*Rajesh.*




Re: Collations are not working fine.

2015-02-12 Thread Nitin Solanki
Hi James Dyer,
  I did as you suggested and used WordBreakSolrSpellChecker instead of
shingles, but collations are still not being returned.
For instance, I tried to get the collation "gone with the wind" by searching
for "gone wthh thes wint" on field=gram_ci, but it did not succeed, even
though I do get the suggestions *with* for wthh, *the* for thes, and *wind*
for wint.
The phrase "gone with the wind" appears 167 times in my documents. I don't
know whether I am missing something.
Please check my Solr configuration below:

*URL: *localhost:8983/solr/wikingram/spell?q=gram_ci:gone wthh thes
wint&wt=json&indent=true&shards.qt=/spell
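
A minimal sketch of building this request with every parameter percent-encoded, so that the spaces in the query and the & separators survive copy and paste. The http:// scheme and the function name build_spell_url are assumptions added for illustration. Sending one spellcheck.dictionary parameter per configured spellchecker (default and wordbreak) follows the pattern in the Solr example configuration; whether that resolves the missing collations here is an assumption, not something confirmed in this thread.

```python
from urllib.parse import urlencode

# Host, port and core name come from the URL in the message;
# the http:// scheme is assumed.
BASE_URL = "http://localhost:8983/solr/wikingram/spell"


def build_spell_url(query, dictionaries=("default", "wordbreak")):
    """Return a /spell request URL with all parameters percent-encoded."""
    params = [
        ("q", query),
        ("wt", "json"),
        ("indent", "true"),
        ("shards.qt", "/spell"),
    ]
    # Assumption: one spellcheck.dictionary parameter per spellchecker
    # defined in the searchComponent (names taken from the config below).
    params += [("spellcheck.dictionary", name) for name in dictionaries]
    return BASE_URL + "?" + urlencode(params)


url = build_spell_url("gram_ci:gone wthh thes wint")
print(url)
```

Passing the parameters as a list of pairs rather than a dict is what allows spellcheck.dictionary to be repeated.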

*solrconfig.xml:*

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpellCi</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">gram_ci</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">0</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">2</int>
    <float name="maxQueryFrequency">0.9</float>
    <str name="comparatorClass">freq</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">gram</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">5</int>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">gram_ci</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.maxResultsForSuggest">1</str>
    <str name="spellcheck.alternativeTermCount">25</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">50</str>
    <str name="spellcheck.maxCollationTries">50</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

*Schema.xml: *

<field name="gram_ci" type="textSpellCi" indexed="true" stored="true"
       multiValued="false"/>

<fieldType name="textSpellCi" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>


RE: Collations are not working fine.

2015-02-10 Thread Dyer, James
Nitin,

I have not tested using shingles with collations but my guess here is the 
collation feature is not going to work as expected with a shingled index.  So 
try re-indexing without the shingles and see if it gives you more intuitive 
results.  If that helps, and if you want to still correct whitespace errors, 
then consider using WordBreakSolrSpellChecker instead of shingles (the main 
solr example demonstrates how).  

Beyond that, without some queries *and* the full spellcheck response, and an 
explanation as to why you feel the spellcheck response is wrong, I'm not sure 
you will get much more help with this.

Here is what "hits" in the collation response means:

 By "hits", it means that if you replaced the q parameter on the original
 query with the collation but left everything else the same (filters, etc.),
 this is how many results you would get.

James Dyer
Ingram Content Group



Re: Collations are not working fine.

2015-02-09 Thread Bill Bell
Can you order the collations by highest to lowest hits?

Bill Bell
Sent from mobile


 On Feb 9, 2015, at 6:47 AM, Nitin Solanki nitinml...@gmail.com wrote:
 
 I am working on spell checking in Solr. I have implemented suggestions and
 collations in my spell checker component.
 
 Most of the time collations work fine, but in a few cases they fail.
 
 *Working*:
 I tried the query *gone wthh thes wnd*. Here wnd does not get the suggestion
 wind, but the collation is correct: gone with the wind, hits = 117.
 
 
 *Not working:*
 When I tried the query *gone wthh thes wint*, wint does get the
 suggestion wind, but the collation is wrong. Instead of gone with
 the wind it gives gone with the west, hits = 1.
 
 I would also like to know what *hits* means in collations.


Re: Collations are not working fine.

2015-02-09 Thread Nitin Solanki
Hi *James Dyer*,
   I am not using stemming, and my spellcheck.alternativeTermCount is set
equal to spellcheck.count. I have pasted my solrconfig.xml and schema.xml
configuration below.


*URL: *
localhost:8983/solr/wikingram/spell?q=gram_ci:deligh&wt=json&indent=true&shards.qt=/spell

*solrconfig.xml:*

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <str name="queryAnalyzerFieldType">textSpellCi</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">gram_ci</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="distanceMeasure">internal</str>
    <float name="accuracy">0.5</float>
    <int name="maxEdits">2</int>
    <int name="minPrefix">0</int>
    <int name="maxInspections">5</int>
    <int name="minQueryLength">2</int>
    <float name="maxQueryFrequency">0.9</float>
    <str name="comparatorClass">freq</str>
  </lst>
</searchComponent>

<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="df">gram_ci</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">25</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.maxResultsForSuggest">1</str>
    <str name="spellcheck.alternativeTermCount">25</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">50</str>
    <str name="spellcheck.maxCollationTries">50</str>
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>

*Schema.xml: *

<field name="gram_ci" type="textSpellCi" indexed="true" stored="true"
       multiValued="false"/>

<fieldType name="textSpellCi" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
            minShingleSize="2" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
            minShingleSize="2" outputUnigrams="true"/>
  </analyzer>
</fieldType>




Re: Collations are not working fine.

2015-02-09 Thread Nitin Solanki
Hi Bill Bell,
 Sorry, I don't know how to sort collations by hits. Could
you please assist me?
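
One way to do the ordering on the client side, as a rough sketch. It assumes the collations have already been pulled out of the spellcheck response into a list of dicts carrying the collationQuery and hits values that spellcheck.collateExtendedResults=true reports; the exact JSON layout depends on the Solr version, so the extraction step is left out. The function name and the sample list are hypothetical, with figures borrowed from this thread purely for illustration.

```python
def sort_collations_by_hits(collations):
    """Return the collations ordered from most hits to fewest."""
    return sorted(collations, key=lambda c: c["hits"], reverse=True)


# Hypothetical, already-extracted collations (values borrowed from the thread).
collations = [
    {"collationQuery": "gone with the west", "hits": 1},
    {"collationQuery": "gone with the wind", "hits": 117},
]

ranked = sort_collations_by_hits(collations)
for entry in ranked:
    print(entry["hits"], entry["collationQuery"])
```

sorted() returns a new list, so the order in which Solr returned the collations is still available in the original list.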




RE: Collations are not working fine.

2015-02-09 Thread Dyer, James
Nitin,

My guess here is that your spellcheck field is a field that has stemming.  This
might be why you get a collation that returns "wind" even though the user
queried "wnd" and it does not get any suggestions.  Perhaps "wnd" is stemmed
the same as "wind"?  (Spellcheck usually works best if you copyField the
query field to something that is tokenized but not heavily analyzed, and use
the copy as the spellcheck dictionary.)

The other problem might be that "wind" is in the index but you are not using
spellcheck.alternativeTermCount.  If you set this to the same value as
spellcheck.count, then it will give suggestions even when words exist in the
index.

By "hits", it means that if you replaced the q parameter on the original query
with the collation but left everything else the same (filters, etc.), this is
how many results you would get.
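
To make the definition of hits concrete, here is a small sketch of the request that a collation's hit count corresponds to: the original parameters with only q swapped for the collation. The function name, the fq value and the parameter dict are hypothetical and only illustrate the idea; the number of results Solr returns for the rewritten request is what hits reports.

```python
def collation_check_params(original_params, collation_query):
    """Copy the original request parameters, replacing only q."""
    params = dict(original_params)  # leave the caller's dict untouched
    params["q"] = collation_query
    return params


# Hypothetical original request; the fq filter is made up for illustration.
original = {"q": "gram_ci:gone wthh thes wnd", "fq": "lang:en", "wt": "json"}

check = collation_check_params(original, "gram_ci:gone with the wind")
print(check)
```

Filters such as fq are carried over unchanged, which is why the same collation can report different hit counts under different filters.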

If you need more help, please include in your message the pertinent sections of
solrconfig.xml and schema.xml, as well as the full query URL you are using and
the full spellcheck response.

James Dyer
Ingram Content Group


-Original Message-
From: Nitin Solanki [mailto:nitinml...@gmail.com] 
Sent: Monday, February 09, 2015 7:47 AM
To: solr-user@lucene.apache.org
Subject: Collations are not working fine.

I am working on spell checking in Solr. I have implemented suggestions and
collations in my spell checker component.

Most of the time collations work fine, but in a few cases they fail.

*Working*:
I tried the query *gone wthh thes wnd*. Here wnd does not get the suggestion
wind, but the collation is correct: gone with the wind, hits = 117.


*Not working:*
When I tried the query *gone wthh thes wint*, wint does get the
suggestion wind, but the collation is wrong. Instead of gone with
the wind it gives gone with the west, hits = 1.

I would also like to know what *hits* means in collations.