Thanks Tomás, I'll take a look. Still interested to hear from anyone about using queries to populate the list - I'm willing to give up a bit of performance for the flexibility it would provide.
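
As a rough illustration of the query-populated approach (option 2 in the quoted message below), here is a minimal Python sketch. It does not call Solr at all: the `DOCS` list stands in for the results of a search already filtered by warehouse, and the names `build_dictionary` and `suggest` and the `example.com` domains are hypothetical. The fuzzy matching uses `difflib` from the standard library as a stand-in for whatever lookup the suggester would provide.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical records; in practice these would come from a search query
# that is already filtered by the client warehouse (e.g. by domain).
DOCS = [
    {"warehouse": "a.example.com", "title": "Coastline survey"},
    {"warehouse": "a.example.com", "title": "Coastal erosion map"},
    {"warehouse": "b.example.com", "title": "Coastguard stations"},
]


def build_dictionary(docs, warehouse):
    """Count title terms from documents that belong to one warehouse only."""
    terms = Counter()
    for doc in docs:
        if doc["warehouse"] == warehouse:
            terms.update(doc["title"].lower().split())
    return terms


def suggest(terms, word, limit=5):
    """Return fuzzy matches for `word`, drawn only from the given dictionary."""
    return get_close_matches(word.lower(), list(terms), n=limit, cutoff=0.6)


if __name__ == "__main__":
    dictionary_a = build_dictionary(DOCS, "a.example.com")
    print(suggest(dictionary_a, "coastlin"))
```

Because the dictionary is built only from the filtered records, terms from another warehouse can never appear in the suggestions. The cost is rebuilding (or caching) a dictionary per filter, which is the performance trade-off mentioned above.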

On Thu, Jan 16, 2014 at 1:06 PM, Tomás Fernández Löbbe <tomasflo...@gmail.com> wrote:

> I think your use case is the one described in LUCENE-5350, maybe you want
> to take a look at the patch and comments there.
>
> Tomás
>
> On Wed, Jan 15, 2014 at 12:58 PM, Hamish Campbell <hamish.campb...@koordinates.com> wrote:
>
> > Hi all,
> >
> > I'm looking into options for filtering the search suggestions dictionary.
> >
> > We're indexing records for a multi-tenanted SaaS platform using Solr
> > 4.6.0, the Suggester component and fst.FuzzyLookupFactory with a
> > field-based dictionary. SearchHandler records are always filtered by the
> > particular client warehouse (e.g. by domain); however, we need a way to
> > apply a similar filter to the spell check dictionary to prevent leaking
> > terms between clients. In other words: when client A searches for a
> > document title, they should not receive spelling suggestions for client
> > B's document titles.
> >
> > This has been asked a couple of times, on the mailing list and on
> > StackOverflow. Some of the suggested approaches:
> >
> > 1. Use dynamic fields to create dictionaries per-warehouse (mentioned here:
> > http://lucene.472066.n3.nabble.com/Filtering-down-terms-in-suggest-tt4069627.html)
> >
> > That might be a reasonable option for us (we already considered a similar
> > approach), but at what point does this stop scaling efficiently? How many
> > dynamic fields are too many?
> >
> > 2. Run a query to populate the suggestion list (also mentioned in that
> > thread)
> >
> > If I understand this correctly, this would give us a lot of flexibility
> > and power: for example, to give a more nuanced result set using the
> > user's permissions to expose private documents in their spelling
> > suggestions.
> >
> > I expect this would be a slow query, but our total document count is
> > currently relatively small (on the order of 10^3 objects) and I imagine
> > you could create a specific word index with the appropriate fields to
> > keep this in check. Is this a feasible approach, and if so, how do you
> > build a dynamic suggestion list?
> >
> > 3. Other options:
> >
> > It seems like this is a common problem, and we could throw some
> > resources at building an extension to provide some limited suggestion
> > dictionary filtering. Is anyone already doing something similar, or has
> > anyone found a clever hack around this, or can anyone suggest a starting
> > point?
> >
> > Thanks everyone!
> >
> > --
> > Hamish Campbell
> > Koordinates Ltd <http://koordinates.com/?_bzhc=esig>
> > PH +64 9 966 0433
> > FAX +64 9 966 0045

--
Hamish Campbell
Koordinates Ltd <http://koordinates.com/?_bzhc=esig>
PH +64 9 966 0433
FAX +64 9 966 0045