On Mon, 2008-01-21 at 11:40 -0800, Michael Busch wrote: > what kind of queries are you using for your tests? (num query terms, > booleans clauses, phrases, wildcards?)
No numbers (at least not parsed as numbers), no ranges, some wildcards, some phrases. The only non-trivial part of the queries is that we use group expansion so that searches for title:something gets expanded to (title_type1:something or title_type2:something ...). The query-parsing takes about 20-40 seconds in total for 470.000 queries. The average number of hits/search is around 5000 with the classic long tail of a few searches giving millions of hits and a lot of searches giving hundreds of hits. Some of the queries are listed below. The majority of queries are simple with two or three words. Without any qualifiers, the query parser expands the entered strings to a search in 7 fields. kreditaftale* børns relationer lma_long:"bog" su_dk:"undersøgelser" detecting information content of corparate announcements using vaiance increases Three sisters author_normalised:"Kittelsen, Theodor" Cherry garden kultur fortællinger Salinger lma_long:"bog" llang:"eng" hornsleth kristian templars templars borgerskabet Civillproces uddannelse børn død su_dk:"sorg" su_dk:"dødsfald" su_dk:"erindringer" turen går til biblioteker integration plutarch fjender Danske præster og biskopper om kirkelige vielser af homoseksuelle pendul* author_normalised:"poe edgar allan" (china economy ) lma_long:"ebog" (foucault NOT pendul) lma_long:"tryktbog" --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]