RE: Spellchecking suggestions won't collate

2014-08-20 Thread Dyer, James
Because "my" is the 7th suggestion down the list, the spellchecker is going to 
need more than 30 tries to find the collation that returns hits.  You can 
increase maxCollationTries if you're willing to endure the performance penalty 
of trying so many replacement queries.  This case actually highlights why 
DirectSolrSpellChecker by default doesn't even bother with short words like 
this.
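
To illustrate why 30 tries can run out before "my" is reached, here is a rough 
Python sketch. It is only a toy model and not Solr's actual collation 
ordering: it assumes each original term is kept as a rank-0 candidate ahead of 
its suggestions, and that combinations are tried in order of increasing total 
rank. The function name tries_until is hypothetical.

```python
from itertools import product


def tries_until(target, candidates):
    """Return the 1-based try at which `target` is reached, or None.

    Toy model (an assumption, not Solr's real algorithm): combinations are
    tried in order of increasing summed rank across all terms.
    """
    ranked = sorted(
        product(*(range(len(c)) for c in candidates)),
        key=sum,  # lower total rank is tried first
    )
    for n, ranks in enumerate(ranked, start=1):
        combo = tuple(c[r] for c, r in zip(candidates, ranks))
        if combo == target:
            return n
    return None


# Original term first (rank 0), then the suggestions from the response.
candidates = [
    ["mi", "mr", "mp", "mid", "mix", "mb", "mj", "my", "md", "mc", "ma"],
    ["next", "nest", "news", "neil"],
    ["promo", "photo", "prime", "pronto", "prof"],
]

tries_needed = tries_until(("my", "next", "promo"), candidates)
print(tries_needed > 30)  # the target falls beyond a 30-try budget
```

Under these assumptions, every combination with a lower total rank than 
"my next promo" is tried first, which already uses up more than 30 tries.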

Rather than letting the spellchecker check words this small, perhaps you can 
just scan the user's input and make any words of 4 characters or fewer 
optional?  Or even just use an mm below 100% (65%?).  I realize this will cost 
you a little precision, but recall will be better and you'll have to rely less 
on spellcheck.
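
As a minimal sketch of the mm idea, the Python below builds the request 
parameters with mm set to 65%. The parameter names (q, defType, qf, mm, 
spellcheck.maxCollationTries) are taken from the request in this thread or 
are standard edismax parameters; the helper name build_params and the 
max_collation_tries value of 100 are only assumptions for illustration.

```python
from urllib.parse import urlencode


def build_params(user_query, mm="65%", max_collation_tries=30):
    """Build Solr request parameters with a relaxed minimum-should-match.

    Hypothetical helper: the defaults here are illustrative, not tuned.
    """
    return {
        "q": user_query,
        "defType": "edismax",
        "qf": "BUS_BUSINESS_NAME_PHRASE",
        "mm": mm,  # below 100% so not every word has to match
        "spellcheck": "on",
        "spellcheck.collate": "true",
        "spellcheck.maxCollationTries": str(max_collation_tries),
        "wt": "json",
    }


params = build_params("Mi Next Promo", mm="65%", max_collation_tries=100)
query_string = urlencode(params)  # percent-encodes the "%" in mm
print(query_string)
```

The resulting query string can be appended to the select handler URL of 
whichever core is being queried.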

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Corey Gerhardt [mailto:corey.gerha...@directwest.com] 
Sent: Friday, August 15, 2014 3:21 PM
To: Solr User List
Subject: Spellchecking suggestions won't collate

It must be Friday. I can't figure out why there is no collation value:

{
  "responseHeader":{
    "status":0,
    "QTime":31,
    "params":{
      "spellcheck":"on",
      "spellcheck.collateParam.qf":"BUS_BUSINESS_NAME",
      "spellcheck.maxResultsForSuggest":"5",
      "spellcheck.maxCollations":"3",
      "spellcheck.maxCollationTries":"30",
      "qf":"BUS_BUSINESS_NAME_PHRASE",
      "q.alt":"*:*",
      "spellcheck.collate":"true",
      "spellcheck.onlyMorePopular":"false",
      "defType":"edismax",
      "debugQuery":"true",
      "echoParams":"all",
      "spellcheck.count":"10",
      "spellcheck.alternativeTermCount":"10",
      "indent":"true",
      "q":"Mi Next Promo",
      "wt":"json"}},
  "response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
  },
  "spellcheck":{
    "suggestions":[
      "mi",{
        "numFound":10,
        "startOffset":0,
        "endOffset":2,
        "suggestion":["mr",
          "mp",
          "mid",
          "mix",
          "mb",
          "mj",
          "my",
          "md",
          "mc",
          "ma"]},
      "next",{
        "numFound":3,
        "startOffset":3,
        "endOffset":7,
        "suggestion":["nest",
          "news",
          "neil"]},
      "promo",{
        "numFound":4,
        "startOffset":8,
        "endOffset":13,
        "suggestion":["photo",
          "prime",
          "pronto",
          "prof"]}]},

The actual business name is "My Next Promo", which I was hoping would be the 
collation value.

Thanks,

Corey



RE: Spellchecking suggestions won't collate

2014-08-20 Thread Corey Gerhardt
I'm working with business names, which are sometimes even people's names, such 
as "Wardell F E B Dr".  I suspect I need to change my logic so that it relies 
less on spellchecking, as you suggest.

Thanks.

Corey
