Alessandro,

The "spellcheck.collate" feature already supports this if you set 
"spellcheck.maxCollationTries" to a value greater than zero.  This is useful 
both for preventing unauthorized access to data and for guaranteeing that 
suggested collations will return some results.
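
As a rough illustration of such a request, here is a minimal Python sketch 
that only builds the request URL. The handler path, the "acl" field and the 
filter value are hypothetical; the "spellcheck.*" names are the standard 
request parameters. It assumes the access restriction is passed as an 
ordinary "fq", so that the collation test queries run under the same filter.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url, user_query, access_filter, max_tries=5):
    """Build a request URL that asks Solr to verify collations against the index.

    base_url and access_filter are placeholders for illustration only.
    """
    params = [
        ("q", user_query),
        # Hypothetical security filter; collation test queries are assumed
        # to be executed with the same filter queries as the main request.
        ("fq", access_filter),
        ("spellcheck", "true"),
        ("spellcheck.collate", "true"),
        # A value greater than zero makes Solr try the collations as queries.
        ("spellcheck.maxCollationTries", str(max_tries)),
        ("spellcheck.maxCollations", "1"),
    ]
    return base_url + "?" + urlencode(params)


url = build_spellcheck_url(
    "http://localhost:8983/solr/collection1/spell",  # hypothetical handler
    "delll ultrashar",
    "acl:group_a",  # hypothetical field and value
)
```

Each try costs one extra query against the index, so a small value for 
"spellcheck.maxCollationTries" is probably a sensible starting point.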

However, "maxCollationTries" accomplishes this by running the proposed 
collation queries against the index.  If you are only interested in preventing 
unauthorized access, you can probably get better performance with a 
lower-level filter applied to the individual terms.  There is currently no way 
to filter the single-term suggestions.

I could see this as a nice enhancement, but given the current 
"maxCollationTries" support, it may have a pretty narrow use-case.

I've also thought about moving all the collate functionality to the Lucene 
level, so that clients other than Solr can take advantage of it.  Perhaps 
something along the lines of your proposal could be a step in that direction?

James Dyer
Ingram Content Group
(615) 213-4311

From: Alessandro Benedetti [mailto:[email protected]]
Sent: Wednesday, January 15, 2014 11:53 AM
To: [email protected]
Subject: Re: [Apache Solr] Filter query Suggester and Spellchecker

No one, guys?

2014/1/14 Alessandro Benedetti <[email protected]>
Hi guys,
this is a proposal for an improvement.
I propose adding the ability to suggest terms (for spellchecking and 
auto-suggest) based only on a subset of documents.

This way we can provide security implementations that let users see term 
suggestions only from documents they are allowed to see.

These are the proposed approaches:

Filter query Auto Suggest

1) retrieve the suggested tokens for the input text using the already 
cutting-edge FST-based suggester
2) use an approach similar to TermEnum, because
a) we have a small set of suggestions (a reasonable assumption, because we can 
limit it to 5-10 suggestions at most), so the TermEnum approach will be fast.
b) for each suggested token we can get the posting list and intersect it with 
the DocId list resulting from the filter query; if the intersection is empty, 
we do not return the suggestion.

Filter query Spellcheck

1) use the already cutting-edge FSA-based direct index spellchecker to get the 
suggestions
2) use an approach similar to TermEnum, because
a) we have a small set of suggestions (a reasonable assumption, because we can 
limit it to 5-10 suggestions at most), so the TermEnum approach will be fast.
b) for each suggested token we can get the posting list and intersect it with 
the DocId list resulting from the filter query; if the intersection is empty, 
we do not return the suggestion.
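
Step 2b, which is the same in both approaches, could be sketched roughly as 
follows. This is plain Python rather than Lucene code: the posting lists are 
modelled as a dictionary from term to document ids, and the function name and 
sample data are made up for illustration.

```python
def filter_suggestions(suggestions, postings, allowed_docs):
    """Keep only the suggestions that occur in at least one allowed document.

    suggestions  -- candidate terms from the suggester or spellchecker
    postings     -- hypothetical in-memory index: term -> iterable of doc ids
    allowed_docs -- doc ids matched by the filter query
    """
    allowed = set(allowed_docs)
    kept = []
    for term in suggestions:
        # An empty intersection means the user may not see any document
        # containing this term, so the suggestion is dropped.
        if not allowed.isdisjoint(postings.get(term, ())):
            kept.append(term)
    return kept


# Made-up sample data: three candidate terms and their posting lists.
postings = {"solr": [1, 4, 9], "solar": [2, 3], "sole": [7]}
visible = filter_suggestions(["solr", "solar", "sole"], postings, {2, 3, 4})
# visible == ["solr", "solar"]; "sole" only occurs in document 7
```

With only 5-10 candidate terms, the cost is one posting-list lookup per 
candidate, which is why the approach should stay cheap.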

Of course we will have to add a further parameter to the request handler, 
something like:
spellcheck.qf

Let me know your impressions and ideas,

Cheers





--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England


