But the entirety of the old indexes (no longer on disk) wasn't cached in
memory, right?  Or was it?  Maybe this is me not understanding Lucene well
enough. I thought that portions of the index were cached in memory, but that
sometimes the index reader still has to go to disk to get things that aren't
currently in the caches.  If this is true (tell me if it's not!), we have an
index reader that was based on index files that... are no longer on disk. But
the index reader is still open. What happens when it has to go to disk for info?

And will the second replication trigger a commit even if there are in fact no
new files to be transferred over to the slave, because there have been no
changes since the prior sync with the failed commit?
________________________________________
From: Upayavira [...@odoko.co.uk]
Sent: Tuesday, December 14, 2010 2:23 AM
To: solr-user@lucene.apache.org
Subject: RE: OutOfMemory GC: GC overhead limit exceeded - Why isn't WeakHashMap 
getting collected?

The second commit will bring in all changes, from both syncs.

Think of the sync part as a glorified rsync of files on disk. So the
files will have been copied to disk, but the in memory index on the
slave will not have noticed that those files have changed. The commit is
intended to remedy that - it causes a new index reader to be created,
based upon the new on disk files, which will include updates from both
syncs.
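
A minimal sketch of that idea, as a toy model and not Solr's actual
replication code. The class, method, and file names are hypothetical and
chosen only for illustration: sync only changes what is on disk, and a
successful commit opens a new reader over whatever is on disk at that moment.

```python
class Slave:
    """Toy model: files on disk vs. the snapshot the open reader was built from."""

    def __init__(self):
        self.disk = set()                # index files currently on disk
        self.reader_view = frozenset()   # files the open reader sees

    def sync(self, new_files):
        # Like rsync: files land on disk, the open reader is untouched.
        self.disk |= set(new_files)

    def commit(self, succeed=True):
        # A successful commit opens a new reader over the current disk state.
        # A failed commit leaves the old reader in place.
        if succeed:
            self.reader_view = frozenset(self.disk)
        return succeed


slave = Slave()
slave.sync(["segments_2", "_1.cfs"])   # first sync (hypothetical file names)
slave.commit(succeed=False)            # commit fails: reader still sees nothing new
stale_view = slave.reader_view

slave.sync(["segments_3", "_2.cfs"])   # second sync
slave.commit()                         # succeeds: reader now covers both syncs
```

After the second commit, `slave.reader_view` contains the files from both
syncs, while `stale_view` shows that the failed commit changed nothing for
searches.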

Upayavira

On Mon, 13 Dec 2010 23:11 -0500, "Jonathan Rochkind" <rochk...@jhu.edu>
wrote:
> Sorry, I guess I don't understand the details of replication enough.
>
> So slave tries to replicate. It pulls down the new index files. It tries
> to do a commit but fails.  But "the next commit that does succeed will
> have all the updates." Since it's a slave, it doesn't get any commits of
> its own. But then some amount of time later, it does another replication
> pull. There are at this time maybe no _new_ changes since the last failed
> replication pull. Does this trigger a commit that will get those previous
> changes actually added to the slave?
>
> In the meantime, between commits.. are those potentially large pulled new
> index files sitting around somewhere but not replacing the old slave
> index files, doubling disk space for those files?
>
> Thanks for any clarification.
>
> Jonathan
> ________________________________________
> From: ysee...@gmail.com [ysee...@gmail.com] On Behalf Of Yonik Seeley
> [yo...@lucidimagination.com]
> Sent: Monday, December 13, 2010 10:41 PM
> To: solr-user@lucene.apache.org
> Subject: Re: OutOfMemory GC: GC overhead limit exceeded - Why isn't
> WeakHashMap getting collected?
>
> On Mon, Dec 13, 2010 at 9:27 PM, Jonathan Rochkind <rochk...@jhu.edu>
> wrote:
> > Yonik, how will maxWarmingSearchers in this scenario affect replication?
> > If a slave is pulling down new indexes so quickly that the warming
> > searchers would ordinarily pile up, but maxWarmingSearchers is set to 1...
> > what happens?
>
> Like any other commits, this will limit the number of searchers
> warming in the background to 1.  If a commit is called, and that tries
> to open a new searcher while another is already warming, it will fail.
>  The next commit that does succeed will have all the updates though.
>
> Today, this maxWarmingSearchers check is done after the writer has
> closed and before a new searcher is opened... so calling commit too
> often won't affect searching, but it will currently affect indexing
> speed (since the IndexWriter is constantly being closed/flushed).
>
> -Yonik
> http://www.lucidimagination.com
>
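
Yonik's description of maxWarmingSearchers quoted above can be sketched as a
toy model. This is not Solr's implementation: the class and attribute names
are hypothetical, documents are reduced to a counter, and a rejected commit
returns False here where a real server would report an error.

```python
class Core:
    """Toy model of commit vs. a maxWarmingSearchers limit."""

    def __init__(self, max_warming_searchers=1):
        self.max_warming = max_warming_searchers
        self.pending_docs = 0    # added but not yet flushed
        self.flushed_docs = 0    # on disk after the writer is closed/flushed
        self.visible_docs = 0    # what the registered searcher can see
        self._warming = []       # snapshots held by searchers still warming

    def add(self, n):
        self.pending_docs += n

    def commit(self):
        # The writer is closed/flushed first, so updates always reach disk.
        self.flushed_docs += self.pending_docs
        self.pending_docs = 0
        # The limit is checked after the flush, before opening a new searcher.
        if len(self._warming) >= self.max_warming:
            return False         # no new searcher; searches are unaffected
        self._warming.append(self.flushed_docs)
        return True

    def finish_warming(self):
        # The oldest warming searcher becomes the registered searcher.
        self.visible_docs = self._warming.pop(0)


core = Core(max_warming_searchers=1)
core.add(10)
first = core.commit()      # opens a warming searcher
core.add(5)
second = core.commit()     # rejected: one searcher is already warming
core.finish_warming()
third = core.commit()      # succeeds and picks up all 15 documents
core.finish_warming()
```

The rejected commit still flushed its 5 documents to disk, which is why the
next successful commit exposes all the updates.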
