[
https://issues.apache.org/jira/browse/SOLR-9658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904200#comment-16904200
]
Hoss Man commented on SOLR-9658:
--------------------------------
* i should have noticed/mentioned this in the last patch but: any method
(including your new {{markAndSweepByIdleTime()}} that that expects to be called
only when markAndSweepLock is already held should really start with {{assert
markAndSweepLock.isHeldByCurrentThread();}}
* this patch still seems to modify TestJavaBinCodec unneccessarily? (now that
you re-added the backcompat constructor)
* i don't really think it's a good idea to add these {{CacheListener}} /
{{EvictionListener}} APIs at this point w/o a lot more consideration of their
lifecycle / usage
** I know you introduced them in response to my suggestion to add hooks for
monitoring in tests, but they don't _currently_ seem more useful in the tests
then some of the specific suggestions i made before (more comments on this
below) and the APIs don't seem to be thought through enough to be generally
useful later w/o a lot of re-working...
*** Examples: if the point of creating {{CacheListener}} now is to be able to
add more methods/hooks to it later, then is why is only {{EvictionListener}}
passed down to the {{ConcurrentXXXCache}} impls instead of the entire
{{CacheListener}} ?
*** And why are there 2 distinct {{EvictionListener}} interfaces, instead of
just a common one?
** ... so it would probably be safer/cleaner to avoid adding these APIs now
since there are simpler alternatives available for the tests?
* Re: "...plus adding support for artificially "advancing" the time" ... this
seems overly complex?
** None of the suggestions i made for improving the reliability/coverage of
the test require needing to fake the "now" clock: just being able to insert
synthetic entries into the cache with artifically old timestamps – which could
be done by refactoring out the middle of {{put(...)}} into a new
{{putCacheEntry(CacheEntry ... )}} method that would let the (test) caller set
an arbitrary {{lastAccessed}} value...
{code:java}
/**
* Useable by tests to create synthetic cache entries, also called by {@link
#put}
* @lucene.internal
*/
public CacheEntry<K,V> putCacheEntry(CacheEntry<K,V> e) {
CacheEntry<K,V> oldCacheEntry = map.put(key, e);
int currentSize;
if (oldCacheEntry == null) {
currentSize = stats.size.incrementAndGet();
ramBytes.addAndGet(e.ramBytesUsed() + HASHTABLE_RAM_BYTES_PER_ENTRY); //
added key + value + entry
} else {
currentSize = stats.size.get();
ramBytes.addAndGet(-oldCacheEntry.ramBytesUsed());
ramBytes.addAndGet(e.ramBytesUsed());
}
if (islive) {
stats.putCounter.increment();
} else {
stats.nonLivePutCounter.increment();
}
return oldCacheEntry;
}
{code}
** ...that way tests could "setup" a cache containing arbitrary entries (of
arbitrary size, with arbitrary create/access times that could be from weeks in
the past) and then very precisely inspect the results of the cache after
calling {{markAndSweep()}}
*** or some other new {{triggerCleanupIfNeeded()}} method that can encapsualte
all of the existing {{// Check if we need to clear out old entries from the
cache ...}} logic currently at the end of {{put()}}
* In general, i really think testing of functionality like this should really
focus on testing "what exactly happens when markAndSeep() is called on a cache
containing a very specific set of values?" indepdnent from "does markAndSweep()
get called eventually & automatically if i configure maxIdleTime?"
** the former can be tested w/o the need of any cleanup threads or faking the
TimeSource
** the later can be tested w/o the need of a {{CacheListener}} or
{{EvictionListener}} API (or a fake TimeSource) – just create an anonymous
subclass of {{ConcurrentXXXCache}} who'se markAndSweep() method decrements a
CountDownLatch tha the test tread is waiting on
** isolating the testing of these different concepts not only makes it easier
to test more complex aspects of how {{markAndSweep()}} is expected to work (ie:
"assert exactly which entries are removed if the sum of the sizes == X ==
(ramUpperWatermark + Y) but the two smallest entries (whose total size = Y + 1)
are the only one with an accessTime older then the idleTime") but also makes it
easier to understand & debug failures down the road -if- _when_ they happen.
*** as things stand in your patch, -if- _when_ the "did not evict entries in
time" assert (eventually) trips in a future jenkins build, we won't immediately
be able to tell (w/o added logging) if that's because of a bug in the
{{CleanupThread}} that prevented if from calling {{markAndSweep();}} or a bug
in {{SimTimeSource.advanceMs()}} ; or a bug somewhere in the cache that
prevented {{markAndSweep()}} from recognizing those entries were old; or just a
heavily loaded VM CPU that never got around to scheduling the {{CleanupThread}}
after it's {{await(maxIdle)}} returned. (that last one is hard to definitively
rule out either way, but in a test where we set maxIdleTime="1 second", and use
latch.await("2 minutes"), we could rule it out with a high degree of confidence
– but a 1 second max idle time makes it hard to reliably test anything _else_
interesting on a machine that slow about what items are evicted, which is where
isolating the concepts being tested is really helpful)
> Caches should have an optional way to clean if idle for 'x' mins
> ----------------------------------------------------------------
>
> Key: SOLR-9658
> URL: https://issues.apache.org/jira/browse/SOLR-9658
> Project: Solr
> Issue Type: New Feature
> Reporter: Noble Paul
> Assignee: Andrzej Bialecki
> Priority: Major
> Fix For: 8.3
>
> Attachments: SOLR-9658.patch, SOLR-9658.patch, SOLR-9658.patch,
> SOLR-9658.patch
>
>
> If a cache is idle for long, it consumes precious memory. It should be
> configurable to clear the cache if it was not accessed for 'x' secs. The
> cache configuration can have an extra config {{maxIdleTime}} . if we wish it
> to the cleaned after 10 mins of inactivity set it to {{maxIdleTime=600}}.
> [~dragonsinth] would it be a solution for the memory leak you mentioned?
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]