Had this typed up yesterday and forgot to send. "Is there no way to ensure that the top level filter caches are not expunged when some documents are added to the index and have the changes available at the same time?"
No. And it's not something you can do without major architectural changes. When you commit, background merging kicks in, which renumbers the _internal_ Lucene document IDs. These IDs range from 0 to maxDoc and are used as the bit positions set in the filterCache entries. So if you preserved the filterCache across a commit, the bits would be wrong. The queryResultCache stores internal document IDs as well, so it is invalidated for the same reason.

"If that is the case, then do I need to always have to rely on warmup of caches to get some documents in caches?"

Yes, that's exactly what the "autowarm" feature on the caches is for. The newSearcher event can also be used to hand-craft warmup searches when you know certain things about the index and specifically want to ensure certain warming. Please start out with modest numbers for autowarm, as in 20-30. It's very often the case that you don't need much more than that. What those numbers do in filterCache and queryResultCache is re-execute the associated fq or q clause, respectively.

"Are there any other approaches than warmup which folks usually do to avoid this, if they want to build a fast searchable product and have some write throughput as well?" and "I can't afford to get my caches flushed".

What evidence do you have for this last statement?

"Currently I do commits via my indexing application (after every batch of documents)"

Please, please, please do _not_ do this. It's especially egregious because you do it after every batch of docs. So rather than flushing your caches every 5 minutes (say), you hammer Solr with commit after commit after commit. Configure your soft commit interval to match your latency requirements and forget about it. Or just configure hard commit with openSearcher set to true. Or perhaps even just specify commitWithin when you send docs to Solr. At a guess, you may have seen warnings about "too many on deck searchers" if your commit interval is shorter than your autowarm time.
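To make that concrete, here's a rough sketch of the relevant solrconfig.xml pieces. All the numbers, cache sizes, and the warming query's field/value are illustrative assumptions for this example, not recommendations for any particular index, and the cache class differs by Solr version (solr.CaffeineCache on recent releases, solr.FastLRUCache on older ones):

```xml
<!-- solrconfig.xml fragments; every number here is an illustrative assumption -->

<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit frequently for durability and tlog housekeeping,
       but don't open a new searcher (so caches aren't flushed here) -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit at your visibility-latency requirement (5 minutes here);
       this is what opens the new searcher and triggers autowarming -->
  <autoSoftCommit>
    <maxTime>300000</maxTime>
  </autoSoftCommit>
</updateHandler>

<query>
  <!-- Modest autowarm: re-executes the 20 most recently used fq clauses -->
  <filterCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="20"/>
  <!-- Likewise re-executes the 20 most recently used q clauses -->
  <queryResultCache class="solr.CaffeineCache" size="512" initialSize="512" autowarmCount="20"/>

  <!-- Hand-crafted warming for a query you know matters
       (category:books is a hypothetical fq for illustration) -->
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="fq">category:books</str>
      </lst>
    </arr>
  </listener>
</query>
```

With this in place the indexing client never commits at all; alternatively, the client can pass commitWithin=300000 on the update request instead of relying on autoSoftCommit.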
I'll bend a little bit if the client only issues a commit at the very end of the run, there's precisely one client running at a time, and you can _guarantee_ there's only one commit, but it's usually much easier and more reliable to use the solrconfig settings.

Perhaps you're not entirely familiar with how openSearcher works, so here's a brief review. This applies to either hard commit (openSearcher=true) or soft commit:

1> a commit happens
2> a new searcher is opened and autowarming kicks off
3> incoming searches are served by the _old_ searcher, using all the _old_ caches
4> autowarming completes
5a> incoming requests are routed to the new searcher
5b> the old searcher finishes serving the outstanding requests received before <4> and closes
6> the old caches are flushed

So high read throughput is maintained throughout; queries are never served by an unwarmed searcher.

On Tue, Apr 24, 2018 at 10:36 AM, Lee Carroll <lee.a.carr...@googlemail.com> wrote: > From memory try the following: > Don't manually commit from client after batch indexing > set soft commit to be a long time interval. As long as acceptable to run > stale, say 5 mins or longer if you can. > set hard commit to be short (seconds) to keep everything neat and tidy > regarding updates and avoid backing up log files > set openSearcher=false > > I'm pretty sure that works for at least one of our indices. It's worth a go. > > Lee C > > On 24 April 2018 at 06:56, Papa Pappu <tuhaipa...@gmail.com> wrote: > >> Hi, >> I've written down my query over stack-overflow. Here is the link for that : >> https://stackoverflow.com/questions/49993681/preventing- >> solr-cache-flush-when-commiting >> >> In short, I am facing troubles maintaining my solr caches when commits >> happen and the question provides detailed description of the same. >> >> Based on my use-case if someone can recommend what settings I should use or >> practices I should follow it'll be really helpful. >> >> Thanks and regards, >> Dmitri >>