On 10/6/2009 20:42, Mike Bush wrote:
> 2009/6/10 Evan Daniel<evanbd at gmail.com>:
>> On Wed, Jun 10, 2009 at 6:49 AM, Mike Bush<mpbush at gmail.com> wrote:
>>> XMLLibrarian doesn't currently support searching for phrases or rating
>>> relevance of results based on proximity so I don't think common words
>>> could be of any use in searches now.
>>>
>>> Also, I'm not sure but I think the current index doesn't include words
>>> under 4 letters at all.
>> If you read my previous mails, you'll see that the spider is in
>> fact indexing the word "the".
>>
>
> Yes sorry, I've since searched for 'who' on wanna and it is there, it
> gave me OutOfMemoryException trying to generate the results page
>
You've got it :) This is yet another reason to split the <site> part out.
That way we can keep only the siteId in memory, not the whole URI, before
the union. Even so, I doubt that searching for words like "the who" will
ever work without on-disk temp files.

>> Evan Daniel
>>
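
To make the siteId idea concrete, here is a rough sketch. It is written in Python for brevity and is not XMLLibrarian code. SiteTable, union_site_ids, result_page and the USK@... strings are all made up for illustration. The only point is that the union runs over small integers, and URIs are looked up just for the page of results actually shown.

```python
from typing import Dict, Iterable, List, Set


class SiteTable:
    """Hypothetical table mapping each site URI to a small integer siteId."""

    def __init__(self) -> None:
        self._id_by_uri: Dict[str, int] = {}
        self._uri_by_id: List[str] = []

    def intern(self, uri: str) -> int:
        """Return the siteId for uri, assigning the next free id if it is new."""
        site_id = self._id_by_uri.get(uri)
        if site_id is None:
            site_id = len(self._uri_by_id)
            self._id_by_uri[uri] = site_id
            self._uri_by_id.append(uri)
        return site_id

    def uri(self, site_id: int) -> str:
        """Resolve a siteId back to its full URI."""
        return self._uri_by_id[site_id]


def union_site_ids(postings: Iterable[Iterable[int]]) -> Set[int]:
    """Union the per-word siteId lists; only integers are held in memory here."""
    merged: Set[int] = set()
    for ids in postings:
        merged.update(ids)
    return merged


def result_page(sites: SiteTable, site_ids: Set[int],
                offset: int = 0, limit: int = 10) -> List[str]:
    """Resolve URIs only for the slice of results that will be displayed."""
    page = sorted(site_ids)[offset:offset + limit]
    return [sites.uri(i) for i in page]


# Placeholder URIs, not real keys.
sites = SiteTable()
index = {
    "the": [sites.intern(u) for u in ("USK@aaa/site/1/", "USK@bbb/site/3/")],
    "who": [sites.intern(u) for u in ("USK@bbb/site/3/", "USK@ccc/site/2/")],
}
merged = union_site_ids(index[w] for w in ("the", "who"))
page = result_page(sites, merged)
```

This only shrinks what is held per result. For a word as common as "the" the id set itself can still be too large for memory, which is why the on-disk temp files would still be needed.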