On 10/20/06 11:17 AM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote:
> That definitely seems doable.
> How big is your index?

Currently, the index is 65,000 movies plus actors and directors.
A pretty small corpus.

> What's the form of your queries (AND, or sloppy phrase queries I'd imagine?)

We'll start with "OR", because I think an all-terms default is a
really bad idea. If someone searches for "X-Men 3: The Final Battle",
we need to show them "X-Men 3: The Last Stand". We'll need some sort
of fuzzy matching and sloppy phrases. You should see the misspellings
for "Napoleon Dynamite" ("NEPOLINIAN DYNOMITE").

> If this is for netflix (and isn't confidential), are you just
> searching across DVD info/description, or in customer comments too?

We'll start with the basics and test other things. We are always
testing something new.

> If it is DVD's you're searching, that can't be a large collection, and
> you should be in really good shape. You might even try indexing
> things in separate fields and searching across all those fields while
> assigning boosts separately... it should be fast enough. You might
> also check out the dismax handler if you haven't yet.

I have not checked out dismax.

> Any future plans for utilizing the faceted search?

We have a well-developed browsing design, so I'd rather not mix
facets in with that. Two other things work against using facets:
most of our queries are known-item searches, and I think that facets
work best when there is very broad agreement on the categories. For
example, clothing and food work well, but the subject facets at
Barnes and Noble don't help at all.

wunder
--
Walter Underwood
Search Guru, Netflix
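
To make the "OR" versus all-terms point above concrete, here is a minimal sketch in plain Python. It is not Solr or Lucene code, and the helper names (tokens, matches_all, matches_any, edit_distance) are made up for illustration. It shows why an all-terms default misses the X-Men title while an any-term default finds it, and how a small edit distance can catch a misspelling like "DYNOMITE".

```python
import re

def tokens(text):
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def matches_all(query, title):
    """All-terms (AND) default: every query term must appear in the title."""
    return tokens(query) <= tokens(title)

def matches_any(query, title):
    """Any-term (OR) default: one shared term is enough to match."""
    return bool(tokens(query) & tokens(title))

def edit_distance(a, b):
    """Levenshtein distance: the fewest single-character inserts,
    deletes, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

query = "X-Men 3: The Final Battle"
title = "X-Men 3: The Last Stand"

and_hit = matches_all(query, title)  # "final" and "battle" are not in the title
or_hit = matches_any(query, title)   # "x", "men", "3", "the" are shared
typo_cost = edit_distance("dynomite", "dynamite")
```

With an any-term default the ranking then has to do the work of putting the closest title first, which is where fuzzy matching and sloppy phrases would come in.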