One for Michael: a search for "A Killer's Dream" brings the server to
its knees.
Substring search is the killer, in particular with popular terms like
"a*" and "s*". Some statistics:
- there are 71543 search terms starting with "a", resulting in >16 million
  occurrences across 2.5 million term<->document matches
- there are 54119 search terms starting with "s", resulting in >14 million
  occurrences across 3.3 million term<->document matches
They represent #1 and #3 in document matches (#2 is "t").
If I disable substring search for these two characters ("a killer's
dream*" instead of "a* killer's* dream*"), the search is more than 10x
faster (2900ms -> <200ms).
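
To make the rewrite concrete, here is a rough sketch of how the query
string could be built. This is not the server's actual code: the function
name build_query, the NO_PREFIX set, and the assumption that the tokenizer
splits on non-alphanumeric characters (so "killer's" ends in the term "s")
are all mine, for illustration only.

```python
import re

# Assumption: single-letter terms whose wildcard expansion is too expensive.
NO_PREFIX = {"a", "s"}


def build_query(text, no_prefix=NO_PREFIX):
    """Append '*' to each word unless its last term is in no_prefix."""
    out = []
    for word in text.lower().split():
        # Assumption: the tokenizer splits on non-alphanumerics, so the
        # wildcard on "killer's*" would really expand the term "s".
        pieces = [p for p in re.split(r"[^0-9a-z]+", word) if p]
        if not pieces:
            continue  # skip words that contain no searchable term
        if pieces[-1] in no_prefix:
            out.append(word)        # exact match only, no wildcard
        else:
            out.append(word + "*")  # keep the substring search
    return " ".join(out)


if __name__ == "__main__":
    print(build_query("A Killer's Dream"))
    print(build_query("A Killer's Dream", no_prefix=set()))
```

With the skip set in place this gives "a killer's dream*"; with an empty
set it gives the original "a* killer's* dream*".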
--
Michael