On 14. Jan 2007, at 2:40, Mark Miller wrote:

> First, have you looked at SwarmCache? Cluster aware caching for
> java...

No, I haven't come across that one. I'll take a look, thanks!
As a matter of fact, we do have a network-wide caching mechanism, so
that's what we use.

> Second...does it matter that you cannot share the same cache across
> multiple servers? How about a separate cache on each server? When a
> request hits a particular server for the first time it builds the
> filter and caches it. I do a lot of filter caching that way with
> EHcache.

I currently cache it on each server (just a dumb Map<Integer, Filter>
type of cache). This works fine, but I'm concerned about production
use. The problem is that the first cache miss already hurts us (and
you can't avoid the first one, as you have to calculate the filter at
least once), but going through it a second time on a second server is
disastrous. In our application there are thousands of concurrent
users querying the database interactively, and since it is used over
the web, this has to be fast. Really fast. In general I require
sub-second response times. Calculating the filter currently takes
anywhere between 0.5 and 40 seconds, depending on the user making the
query. When a user is paging, we will probably just re-execute the
search instead of caching the results, but we might not end up doing
the search on the same Lucene server. Having the delay on the first
query is bad; potentially having it on the following pages, too, is
not going to work. We might end up with 10 or more Lucene servers, so
ending up on a different server for each page isn't that unlikely.
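
For the per-server case, here is roughly what I'd replace my dumb Map
with: cache the computed BitSet per (IndexReader, user) pair, similar
in spirit to what CachingWrapperFilter does, so that bits computed
against one reader's document ids are never reused with another.
Just a sketch (PerUserFilterCache is a made-up name; the expensive
filter.bits() call is still unavoidable once per reader):

import java.io.IOException;
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.Filter;

public class PerUserFilterCache {
    // reader -> (userId -> bits); the WeakHashMap drops entries once
    // a closed reader has been garbage collected
    private final Map<IndexReader, Map<Integer, BitSet>> cache =
            new WeakHashMap<IndexReader, Map<Integer, BitSet>>();

    public synchronized BitSet bits(IndexReader reader, Integer userId,
                                    Filter filter) throws IOException {
        Map<Integer, BitSet> perUser = cache.get(reader);
        if (perUser == null) {
            perUser = new HashMap<Integer, BitSet>();
            cache.put(reader, perUser);
        }
        BitSet bits = perUser.get(userId);
        if (bits == null) {
            bits = filter.bits(reader); // the expensive 0.5-40s computation
            perUser.put(userId, bits);
        }
        return bits;
    }
}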

> There must be some way to cache that filter...

My ever-recurring thought over the last couple of days... ;)

The reasoning I put forth in my first mail led me to the insight that
we cannot cache the filter the way Lucene implements filters right
now. Even when caching the filter, the document ids may change: they
are only stable for the lifetime of a given IndexReader, and after
deletes and segment merges the same document can end up with a
different id. That would lead to potentially wrong results, because
my filter cache would then be stale (filtering the wrong documents).
This I must avoid, naturally. Lucene gives me a cache practically for
free already, but I find its ever-changing document ids a problem.
What I'd really like is to be able to assign document ids myself and
have them stay put. The fact that I have to query for a field, one
that I moreover know to be unique, is a huge bottleneck. OK, I
understand that this isn't an issue for fulltext search, and most of
the applications we use it for don't have that id requirement, but
having cheap, direct access to documents in an externally cacheable
manner appeals to me. I'm sure others have been bitten by this too,
and I'm willing to invest some time into an implementation. I'm just
not familiar enough with the codebase to begin hacking right away.
Of course, I might also be missing something crucial that makes my
problem a non-issue (which I would prefer :))
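
In the meantime, one workaround I'm toying with: keep the cached
filter keyed by our stable external ids, and only translate it into
Lucene document ids per reader, using FieldCache so the docId ->
externalId lookup is a single array access. Again just a sketch under
assumptions (the "extid" field name and the ExternalIdFilter class
are made up; it presumes every document carries a unique, indexed,
single-valued external id):

import java.io.IOException;
import java.util.BitSet;
import java.util.Set;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.FieldCache;

public class ExternalIdFilter {
    public static BitSet bits(IndexReader reader, Set<Integer> allowedIds)
            throws IOException {
        // FieldCache builds (and caches) the docId -> extid array
        // once per reader
        int[] extIds = FieldCache.DEFAULT.getInts(reader, "extid");
        BitSet bits = new BitSet(reader.maxDoc());
        for (int doc = 0; doc < extIds.length; doc++) {
            if (allowedIds.contains(extIds[doc])) {
                bits.set(doc);
            }
        }
        return bits;
    }
}

The Set<Integer> of external ids would be what goes into our
network-wide cache; rebuilding the BitSet per reader is then a single
linear scan over maxDoc() instead of thousands of TermDocs lookups.
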
Thanks,
Kay
--
Kay Röpke
http://classdump.org/