[
https://issues.apache.org/jira/browse/SOLR-15859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17633397#comment-17633397
]
Shawn Heisey commented on SOLR-15859:
-------------------------------------
{quote}If you only want the live set, then I would use Map computations for
writes so that the {{hitCounter}} map is always kept consistent with the cache
(use {{Caffeine.evictionListener}} for removal in this case). Otherwise I would
wrap the cached value with additional metadata to avoid the extra map. In
either case, the global synchronized lock should not be used as that will
destroy the performance by synchronizing all operations.
{quote}
I figured that my synchronization additions wouldn't drastically alter
performance because Caffeine is probably already doing something similar
itself... I felt what I was adding to synchronization should be pretty fast and
not cause major issues.
But just in case I'm wrong about that, I have changed it a bit so the
synchronization is just around my new code, and I will upload a new patch very
soon.
One thing that I would like to be able to do is get a value from the cache
without incrementing the hitcounter. I need that because my new handler gets
the number of rows from the cache entry by asking for the cache entry (which
actually does increment the global hitcounter, while Solr's own usage of the
cache does not for some reason) and then getting the size of the DocSet. I
can't do this in CaffeineCache itself, because I can't call size() on the "V"
generic type, and I do not want to introduce any type-specific code into it.
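One possible shape for the "read without counting" idea, as a minimal sketch only: the class below is a hypothetical stand-in built on a plain ConcurrentHashMap, not Solr's CaffeineCache, and the names {{peek}}, {{sizeOf}} and {{hitCount}} are invented for illustration. The caller passes in a function that knows how to measure the value, so the generic class never has to call size() on "V" itself.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.ToLongFunction;

// Hypothetical stand-in for a cache class; this is NOT Solr's CaffeineCache.
class PeekableCacheSketch<K, V> {
  // Assumption: a ConcurrentHashMap plays the role of the real cache.
  private final ConcurrentMap<K, V> store = new ConcurrentHashMap<>();
  private final LongAdder hits = new LongAdder();

  void put(K key, V value) {
    store.put(key, value);
  }

  // Normal lookup: counts a hit when the key is present.
  V get(K key) {
    V value = store.get(key);
    if (value != null) {
      hits.increment();
    }
    return value;
  }

  // Diagnostic lookup: returns the same value but leaves the hit counter alone.
  V peek(K key) {
    return store.get(key);
  }

  // The caller supplies how to measure V, so this class stays type-agnostic.
  // Returns -1 when the key is not cached.
  long sizeOf(K key, ToLongFunction<? super V> sizer) {
    V value = peek(key);
    return value == null ? -1 : sizer.applyAsLong(value);
  }

  long hitCount() {
    return hits.sum();
  }

  public static void main(String[] args) {
    PeekableCacheSketch<String, int[]> cache = new PeekableCacheSketch<>();
    // An int[] of doc ids stands in for a DocSet here.
    cache.put("language:eng", new int[] {1, 5, 9});
    long rows = cache.sizeOf("language:eng", docs -> docs.length);
    System.out.println(rows + " rows, " + cache.hitCount() + " hits");
  }
}
```

With something like this, a handler could pass a DocSet-specific size function from outside, and the type-specific knowledge would stay in the handler rather than in the cache class.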
I would be interested in learning how to do the wrapping that you mentioned.
I'm all for reducing the memory requirement. The potential for needing a lot
of additional memory is why I introduced the extraStats config option ... so
this new metadata will only be there when the admin specifically asks for it.
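For reference, the "wrap the cached value with additional metadata" suggestion might look roughly like the sketch below. This is an assumption about what was meant, not the reviewer's actual design: {{Tracked}} and {{StatsCache}} are hypothetical names, and a ConcurrentHashMap again stands in for the real cache. The point is that the per-entry counters live inside the cached object, so no second map has to be kept in step with the cache and no global lock is needed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

class EntryStatsSketch {
  // Hypothetical wrapper: the cached value plus its own per-entry statistics.
  static final class Tracked<V> {
    final V value;
    final long insertedMillis = System.currentTimeMillis();
    final LongAdder hits = new LongAdder();

    Tracked(V value) {
      this.value = value;
    }
  }

  // Assumption: a ConcurrentHashMap stands in for the real cache.
  static final class StatsCache<K, V> {
    private final ConcurrentMap<K, Tracked<V>> store = new ConcurrentHashMap<>();

    void put(K key, V value) {
      store.put(key, new Tracked<>(value));
    }

    // Lookup that bumps only this entry's counter; no shared lock is taken.
    V get(K key) {
      Tracked<V> entry = store.get(key);
      if (entry == null) {
        return null;
      }
      entry.hits.increment();
      return entry.value;
    }

    // Per-entry hit count, or 0 when the key is not cached.
    long hitsFor(K key) {
      Tracked<V> entry = store.get(key);
      return entry == null ? 0 : entry.hits.sum();
    }

    // Removing the entry discards its statistics with it.
    void evict(K key) {
      store.remove(key);
    }
  }

  public static void main(String[] args) {
    StatsCache<String, String> cache = new StatsCache<>();
    cache.put("language:eng", "docset-placeholder");
    cache.get("language:eng");
    cache.get("language:eng");
    System.out.println("hits = " + cache.hitsFor("language:eng"));
    cache.evict("language:eng");
    System.out.println("hits after evict = " + cache.hitsFor("language:eng"));
  }
}
```

Because the metadata is evicted together with the value, the memory cost is bounded by the cache size, and the wrapper could be skipped entirely when an option like extraStats is off.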
> Add handler to dump filter cache
> --------------------------------
>
> Key: SOLR-15859
> URL: https://issues.apache.org/jira/browse/SOLR-15859
> Project: Solr
> Issue Type: Improvement
> Reporter: Andy Lester
> Assignee: Shawn Heisey
> Priority: Major
> Labels: FQ, cache, filtercache, metrics
> Attachments: cacheinfo.patch, fix_92_startup.patch
>
>
> It would be very helpful to be able to inspect the contents of the
> filterCache.
> I'd like to be able to query something like
> {{/admin/caches?type=filter&nentries=1000&sort=numHits+DESC}}
> nentries would be allowed to be -1 to get everything.
> It would be nice to see these data items for each entry. I don't know which
> are available, but I'm thinking blue sky here:
> * cache key, exactly as stored
> * Timestamp when the entry was inserted
> * Whether the insertion of the entry evicted another entry, and if so which
> one
> * Timestamp of when this entry was last hit
> * Number of hits on this entry forever
> * Number of hits on this entry over some time period
> * Number of documents matched by the filter
> * Number of bytes of memory used by the filter
> These are the sorts of questions I'd like to be able to answer:
> * "I just did a query that I expect will have added a cache entry. Did it?"
> * "Are my queries hitting existing cache entries?"
> * "How big should I set my filterCache size? Should I limit it by number of
> entries or RAM usage?"
> * "Which of my FQs are getting used the most? These are the ones I want in
> my firstSearcher queries." (I currently determine this by processing my old
> solr logs)
> * "Which filters give me the most bang for the buck in terms of RAM usage?"
> * "I have filter X and filter Y, but would it be beneficial if I made a
> filter X AND Y?"
> * "Which FQs are used more at certain times of the day? (Assuming I take
> regular snapshots throughout the day)"
> I imagine a response might look like:
> {{{}}
> {{ "responseHeader": {}}
> {{ "status": 0,}}
> {{ "QTime": 961}}
> {{ },}}
> {{ "response": {}}
> {{ "numFound": 12104,}}
> {{ "filterCacheKeys": {}}
> {{ [}}
> {{ "language:eng": {}}
> {{ "inserted": "2021-12-04T07:34:16Z",}}
> {{ "lastHit": "2021-12-04T18:17:43Z",}}
> {{ "numHits": 15065,}}
> {{ "numHitsInPastHour": 2319,}}
> {{ "evictedKey": "agelevel:4 shippable:Y",}}
> {{ "numRecordsMatchedByFilter": 24328753,}}
> {{ "bytesUsed": 3041094}}
> {{ }}}
> {{ ],}}
> {{ [}}
> {{ "is_set:N": {}}
> {{ ...}}
> {{ }}}
> {{ ],}}
> {{ [}}
> {{ "language:spa": {}}
> {{ ...}}
> {{ }}}
> {{ ]}}
> {{ }}}
> {{}}}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)