Re: Making document numbers persistent

Kay Roepke Sat, 13 Jan 2007 17:59:48 -0800


On 14. Jan 2007, at 2:40 , Mark Miller wrote:

First, have you looked at SwarmCache? Cluster aware caching for java...


No, I haven't come across that one. I'll take a look, thanks!

As a matter of fact, we do have a network-wide caching mechanism, so that's what we use.

Second...does it matter that you cannot share the same cache across multiple servers? How about a separate cache on each server? When a request hits a particular server for the first time it builds the filter and caches it. I do a lot of filter caching that way with EHcache.

I currently cache it in each server (just by using a Map<Integer, Filter> type of dumb cache). This works fine, but I'm concerned about production use. The problem is that the first cache miss is really hurting us already (and you can't avoid the first, as you have to calculate it at least once...), but going through it a second time on a second server is disastrous. In our application there are thousands of concurrent users querying the database interactively, and as it is used over the web, this has to be fast. Really fast. In general I require a sub-second response time. Calculating the filter currently takes anywhere between 0.5 and 40 seconds, depending on the user who makes the query. When the user is paging, we will probably just re-execute the search instead of caching the results, but we might not end up doing the search on the same Lucene server. Having the delay on the first query is bad. Potentially having it on the following pages, too, is not going to work. We might end up with 10 or more Lucene servers, so ending up on a different server for each page isn't that unlikely.
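For illustration, the per-server "dumb cache" described above could be sketched roughly as follows. This is only a sketch under assumptions: the class name PerServerFilterCache and the builder callback are hypothetical, and a plain java.util.BitSet stands in for the computed filter so that the example runs without Lucene. It shows only that concurrent requests for the same user on one server trigger the expensive calculation once; it does nothing about the miss on a second server.

```java
import java.util.BitSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.IntFunction;

// Hypothetical per-server cache: user id -> precomputed filter bits.
// BitSet is a stand-in for the real filter object (an assumption).
final class PerServerFilterCache {
    private final ConcurrentMap<Integer, BitSet> cache = new ConcurrentHashMap<>();
    private final IntFunction<BitSet> builder; // the expensive 0.5 to 40 second step

    PerServerFilterCache(IntFunction<BitSet> builder) {
        this.builder = builder;
    }

    // Returns the cached bits, computing them at most once per user id,
    // even when several threads ask for the same user at the same time.
    BitSet get(int userId) {
        return cache.computeIfAbsent(userId, id -> builder.apply(id));
    }

    // Must be called whenever the index changes, because the cached
    // bits refer to document numbers that may no longer be valid.
    void clear() {
        cache.clear();
    }
}
```

The clear() method is the weak point: every index change throws away the expensive work on every server.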

There must be some way to cache that filter...


My ever recurring thought over the last couple of days...;)

The reasoning I put forth in my first mail led me to the insight that we cannot cache the filters the way Lucene implements them right now. Even when caching the filter, the document ids may change. This would lead to potentially wrong results, because my filter cache is now stale (filtering the wrong documents). This I must avoid, naturally. Lucene is giving me a cache practically for free already, but I find its ever-changing document ids a problem.

What I'd really like to do is to be able to assign document ids and have them stay at that position. The fact that I have to query for a field, which I moreover know to be unique, is a huge bottleneck. Ok, I understand that this isn't an issue for fulltext search, and most of the applications we are using it for don't have that id requirement, but having cheap, direct access to the documents in an externally cacheable manner appeals to me. I'm sure there are others who have been bitten by this, too, and I'm willing to invest some time in the implementation. I'm just not familiar enough with the codebase to just begin hacking. Also, of course, I might be missing something crucial here that makes my problem a non-issue (which I would prefer :))
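To make the staleness problem concrete, here is a small sketch, again with hypothetical names and no Lucene dependency. The index is modelled as an array mapping document number to the unique key field (an assumption made purely for illustration). The cached object holds the stable keys rather than document numbers, and the bits are rebuilt against whatever numbering is current. It does not remove the cost of the key lookup; it only shows why bits cached by document number go wrong once the numbers shift.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

// Hypothetical filter that remembers stable unique keys instead of
// document numbers, so it can be cached outside a single index state.
final class StableKeyFilter {
    private final Set<String> allowedKeys;

    StableKeyFilter(Set<String> allowedKeys) {
        this.allowedKeys = new HashSet<>(allowedKeys);
    }

    // keyByDocNumber[i] is the unique key of the document currently at
    // number i (null for a deleted slot). This array is an assumed
    // stand-in for reading the unique field from the index.
    BitSet bitsFor(String[] keyByDocNumber) {
        BitSet bits = new BitSet(keyByDocNumber.length);
        for (int doc = 0; doc < keyByDocNumber.length; doc++) {
            String key = keyByDocNumber[doc];
            if (key != null && allowedKeys.contains(key)) {
                bits.set(doc);
            }
        }
        return bits;
    }
}
```

With keys "a", "b", "c", "d" at numbers 0 to 3 and a filter allowing "b" and "d", the bits are 1 and 3. If "a" is deleted and the remaining documents are renumbered to 0 to 2, the old bits would select "c" and a nonexistent document, while rebuilding from the keys gives 0 and 2.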


Thanks,

Kay

--
Kay Röpke
http://classdump.org/




