On 10/20/06 11:17 AM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:

> That definitely seems doable.
> How big is your index?

Currently, the index is 65,000 movies plus actors and directors. A pretty
small corpus.

> What's the form of your queries (AND, or sloppy phrase queries I'd imagine?)

We'll start with "OR", because I think an all-terms default is a really
bad idea. If someone searches for "X-Men 3: The Final Battle", we
need to show them "X-Men 3: The Last Stand".
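
To make the difference concrete, here is a minimal sketch in plain Python.
It is not Solr or Lucene code; the tokenizer and the function names
(tokens, matches_all, overlap) are assumptions made up for this
illustration, and the score is a bare count of shared terms.

```python
import re


def tokens(text):
    # Lowercase and split on anything that is not a letter or digit,
    # so "X-Men" becomes the two terms "x" and "men".
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_all(query, title):
    # All-terms ("AND") matching: every query term must be in the title.
    return tokens(query) <= tokens(title)


def overlap(query, title):
    # Any-term ("OR") matching: count the shared terms and use the count
    # as a crude relevance score.
    return len(tokens(query) & tokens(title))


query = "X-Men 3: The Final Battle"
title = "X-Men 3: The Last Stand"

print(matches_all(query, title))  # AND rejects the title: "final", "battle" are missing
print(overlap(query, title))      # OR still matches on "x", "men", "3", "the"
```

With "AND" the right movie is dropped entirely; with "OR" it still matches
on four of the six query terms and can be ranked near the top.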

We'll need some sort of fuzzy matching and sloppy phrases. You should
see the misspellings for "Napoleon Dynamite" ("NEPOLINIAN DYNOMITE").
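
One common way to do this kind of fuzzy matching is edit distance. The
sketch below is plain Python, not what Solr or Lucene does internally;
the function names (edit_distance, closest) and the tiny vocabulary are
assumptions for illustration only.

```python
def edit_distance(a, b):
    # Levenshtein distance: the minimum number of single-character
    # insertions, deletions, and substitutions that turn a into b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        previous = current
    return previous[-1]


def closest(word, vocabulary):
    # Return the vocabulary term with the smallest edit distance to word.
    return min(vocabulary, key=lambda term: edit_distance(word.lower(), term))


# Hypothetical vocabulary of indexed title terms.
vocabulary = ["napoleon", "dynamite", "men", "stand"]

for word in "NEPOLINIAN DYNOMITE".split():
    print(word, "->", closest(word, vocabulary))
```

A real system would also cap the allowed distance so that a query word
with no good match is not forced onto an unrelated term.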

> If this is for netflix (and isn't confidential), are you just
> searching across DVD info/description, or in customer comments too?

We'll start with the basics and test other things. We are always
testing something new.

> If it is DVD's you're searching, that can't be a large collection, and
> you should be in really good shape.  You might even try indexing
> things in separate fields and searching across all those fields while
> assigning boosts separately... it should be fast enough.  You might
> also check out the dismax handler if you haven't yet.
> Any future plans for utilizing the faceted search?

We have a well-developed browsing design, so I'd rather not mix
facets in with that. Two other things work against using facets:
most of our queries are known-item searches, and I think that
facets work best when there is very broad agreement on the categories.
For example, clothing and food work well, but the subject facets
at Barnes and Noble don't help at all.

I have not checked out dismax.

wunder
-- 
Walter Underwood
Search Guru, Netflix

