Re: [PR] SOLR-4587: integrate lucene-monitor into solr [solr]

via GitHub Thu, 25 Apr 2024 13:44:59 -0700


kotman12 commented on PR #2382:
URL: https://github.com/apache/solr/pull/2382#issuecomment-2078142747


   > So I was trying to learn how the main configuration bits fit together here 
and, at a high level, the reverse search idea. My 
_solr-monitor-naive-dinner-demo_ branch (or #2421 diff) off this pull request's 
branch is a side effect of that, and my understanding so far based on it is 
that:
   > 
   > * the in-memory state is in the `Presearcher` object in the 
`ReverseQueryParserPlugin` class object (and in the 
_solr-monitor-naive-dinner-demo_ I just used a simple `Monitor` object instead 
of the `Presearcher` object)
   > * the state is updated via the `MonitorUpdateRequestProcessor` i.e. saved 
searches are added as `MonitorQuery` objects to the `Monitor` object (and 
updating of the `Presearcher` object is a bit different)
   > * the state is accessed via the `ReverseSearchComponent` component 
(currently non-distributed but conceptually distributed would work too?)
   > 
   > Is that basic understanding correct? As a next step I might go learn more 
about the `Presearcher` itself.
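
   To make the flow in the bullets above concrete, here is a deliberately naive 
sketch of reverse search in plain Java. It is not the lucene-monitor API and 
not code from this PR: `NaiveReverseSearch`, `register` and `match` are 
hypothetical names, and saved searches are reduced to "all of these terms must 
appear". `register` plays the role of the update path, and `match` plays the 
role of the search path, first narrowing candidates by term (a toy stand-in for 
what a presearcher does) and then verifying each candidate.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified stand-in for a Monitor plus Presearcher; not the Lucene API.
public class NaiveReverseSearch {
    // queryId -> terms that must all appear in a document (a toy conjunctive saved search)
    private final Map<String, Set<String>> savedQueries = new HashMap<>();
    // term -> ids of saved searches that contain the term (the toy "presearch" index)
    private final Map<String, Set<String>> termToQueryIds = new HashMap<>();

    // Update path: register a saved search under an id.
    public void register(String queryId, String... requiredTerms) {
        Set<String> terms = new HashSet<>();
        for (String term : requiredTerms) {
            terms.add(term.toLowerCase());
        }
        savedQueries.put(queryId, terms);
        for (String term : terms) {
            termToQueryIds.computeIfAbsent(term, t -> new HashSet<>()).add(queryId);
        }
    }

    // Search path: return the ids of saved searches that match the document.
    public Set<String> match(String document) {
        Set<String> docTerms = new HashSet<>(List.of(document.toLowerCase().split("\\s+")));

        // Presearch step: only saved searches sharing at least one term are candidates.
        Set<String> candidates = new HashSet<>();
        for (String term : docTerms) {
            candidates.addAll(termToQueryIds.getOrDefault(term, Set.of()));
        }

        // Verification step: run each candidate against the document.
        Set<String> matches = new TreeSet<>();
        for (String id : candidates) {
            if (docTerms.containsAll(savedQueries.get(id))) {
                matches.add(id);
            }
        }
        return matches;
    }
}
```

   For example, after `register("q1", "pizza", "dinner")` and 
`register("q2", "sushi")`, calling `match("Pizza for dinner tonight")` returns 
only `q1`, and `q2` is never even considered as a candidate.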
   
   I'll give the PR a look, but when I first looked at this, my main concerns 
about wiring a Monitor straight into Solr were:
   
   1. Handling commit/rollback: what should the tlog be updated with if you 
are also writing to a "sidecar" Monitor object?
   2. Handling persistence. Currently the Monitor has its own tightly sealed 
index. It can be configured for persistence, but if you want to peek at the 
segments a Monitor is writing to disk, that might not be easy, especially if 
you need to handle configurations like tlog+pull. The alternative is to use 
only the in-memory Monitor configurations, but that has limitations and takes 
away precious resources from the {cacheId -> deserialized query} cache.
   3. That brings me to my final point: the cache a Monitor object wraps is a 
simple `ConcurrentHashMap` that is updated under a very coarse-grained lock, 
which can block reads for a long time. It just doesn't feel like it jibes with 
Solr's approach to concurrency, which is much more sophisticated (it is a 
fully fledged db after all). We could make the Monitor cache more configurable 
in the upstream Lucene monitor repo, but in my opinion Lucene monitor tries to 
do too much state management that it's not that good at. The most valuable 
thing to take advantage of is its sophisticated reverse search methods (query 
decomposition for faster matching, query tokenization for pre-search, term 
weighting, optimized document-to-query conversion with term-acceptor, and 
probably something else I am forgetting).
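
   To illustrate the concurrency concern in point 3, here is a rough sketch 
contrasting two ways of refreshing a query cache. This is not the actual 
lucene-monitor code: `QueryCacheSketch`, `CoarseLockedCache` and 
`SnapshotCache` are hypothetical names, and the cached values are plain 
strings standing in for deserialized queries. In the first variant, a rebuild 
holds a write lock for its whole duration, so every reader waits. In the second 
variant, a rebuild prepares an immutable copy and publishes it with one atomic 
swap, so readers never wait.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; names and structure are assumptions, not Lucene's code.
public class QueryCacheSketch {

    // Variant 1: one coarse lock guards the whole cache.
    public static final class CoarseLockedCache {
        private final ReadWriteLock lock = new ReentrantReadWriteLock();
        private final Map<String, String> cache = new ConcurrentHashMap<>();

        public String get(String cacheId) {
            lock.readLock().lock();
            try {
                return cache.get(cacheId);
            } finally {
                lock.readLock().unlock();
            }
        }

        public void rebuild(Map<String, String> fresh) {
            lock.writeLock().lock(); // all readers block until the rebuild finishes
            try {
                cache.clear();
                cache.putAll(fresh);
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    // Variant 2: readers see an immutable snapshot that is swapped atomically.
    public static final class SnapshotCache {
        private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Map.of());

        public String get(String cacheId) {
            return snapshot.get().get(cacheId); // no lock on the read path
        }

        public void rebuild(Map<String, String> fresh) {
            snapshot.set(Map.copyOf(fresh)); // the copy is built before it is published
        }
    }
}
```

   The snapshot variant trades extra memory during a rebuild for an 
uncontended read path, which seems closer in spirit to how Solr swaps in new 
searchers, though that comparison is only an analogy.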


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

