Tomás,

The 300+GB size is only inside the index.20110926152410 dir. Inside there are a lot of files. I am almost convinced that something is messed up, like someone having committed on this slave machine.
Thanks

2012/3/23 Tomás Fernández Löbbe <tomasflo...@gmail.com>

> Alexandre, in addition to what Erick said, you may want to check in the
> slave if what's 300+GB is the "data" directory or the "index.<timestamp>"
> directory.
>
> On Fri, Mar 23, 2012 at 12:25 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > not really, unless perhaps you're issuing commits or optimizes
> > on the _slave_ (which you should NOT do).
> >
> > Replication happens based on the version of the index on the master.
> > True, it starts out as a timestamp, but then successive versions
> > just have that number incremented. The version number
> > in the index on the slave is compared against the one on the master,
> > but the actual time (on the slave or master) is irrelevant. This is
> > explicitly to avoid problems with time synching across
> > machines/timezones/whatever....
> >
> > It would be instructive to look at the admin/info page to see what
> > the index version is on the master and slave.
> >
> > But, if you optimize or commit (I think) on the _slave_, you might
> > change the timestamp and mess things up (although I'm reaching
> > here, I don't know this for certain).
> >
> > What's the index look like on the slave as compared to the master?
> > Are there just a bunch of files on the slave? Or a bunch of directories?
> >
> > Instead of re-indexing on the master, you could try to bring down the
> > slave, blow away the entire index and start it back up. Since this is a
> > production system, I'd only try this if I had more than one slave. Although
> > you could bring up a new slave and attach it to the master and see
> > what happens there. You wouldn't affect production if you didn't point
> > incoming requests at it...
> >
> > Best
> > Erick
> >
> > On Fri, Mar 23, 2012 at 11:03 AM, Alexandre Rocco <alel...@gmail.com>
> > wrote:
> > > Erick,
> > >
> > > We're using Solr 3.3 on Linux (CentOS 5.6).
> > > The /data dir on master is actually 1.2G.
> > >
> > > I haven't tried to recreate the index yet. Since it's a production
> > > environment, I guess that I can stop replication and indexing and
> > > then recreate the master index to see if it makes any difference.
> > >
> > > Also just noticed another thread here named "Simple Slave Replication
> > > Question" that says it could be a problem if I'm seeing a
> > > /data/index with a timestamp on the slave node.
> > > Is this info relevant to this issue?
> > >
> > > Thanks,
> > > Alexandre
> > >
> > > On Fri, Mar 23, 2012 at 11:48 AM, Erick Erickson
> > > <erickerick...@gmail.com> wrote:
> > >
> > >> What version of Solr and what operating system?
> > >>
> > >> But regardless, this shouldn't be happening. Indexes can
> > >> temporarily double in size, but any extras should be
> > >> cleaned up relatively soon.
> > >>
> > >> On the master, what's the total size of the <solr home>/data directory?
> > >> I'm a little suspicious of the <backupAfter> on your master, but I
> > >> don't think that's the root of your problem....
> > >>
> > >> Are you recreating the index on the master (by deleting the
> > >> index directory and starting over)?
> > >>
> > >> This is unusual, and I suspect it's something odd in your configuration,
> > >> but I confess I'm at a loss as to what.
> > >>
> > >> Best
> > >> Erick
> > >>
> > >> On Fri, Mar 23, 2012 at 10:28 AM, Alexandre Rocco <alel...@gmail.com>
> > >> wrote:
> > >> > Hello,
> > >> >
> > >> > We have a Solr index that is on average 1.19 GB in size.
> > >> > After configuring the replication, the index size on the slave
> > >> > machine is growing exponentially.
> > >> > Currently we have a slave with 323.44 GB in size.
> > >> > Is there anything that could cause this behavior?
> > >> > The current replication config is below.
> > >> >
> > >> > Master:
> > >> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> > >> >   <lst name="master">
> > >> >     <str name="replicateAfter">commit</str>
> > >> >     <str name="replicateAfter">startup</str>
> > >> >     <str name="backupAfter">startup</str>
> > >> >     <str name="confFiles">
> > >> >       elevate.xml,protwords.txt,schema.xml,spellings.txt,stopwords.txt,synonyms.txt
> > >> >     </str>
> > >> >   </lst>
> > >> > </requestHandler>
> > >> >
> > >> > Slave:
> > >> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> > >> >   <lst name="slave">
> > >> >     <str name="masterUrl">http://master:8984/solr/Index/replication</str>
> > >> >   </lst>
> > >> > </requestHandler>
> > >> >
> > >> > Any pointers will be useful.
> > >> >
> > >> > Thanks,
> > >> > Alexandre
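
For anyone following the thread who wants to run the check Tomás suggested (is the space in "data" as a whole or in one "index.<timestamp>" directory?), here is a minimal sketch in Python. It is not part of Solr; the function names `dir_size` and `index_dir_sizes` are made up for this example, and the data directory path is whatever your own install uses. It only walks the filesystem and adds up file sizes per immediate subdirectory.

```python
import os
import sys


def dir_size(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so linked files are not counted twice.
            if os.path.isfile(full) and not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def index_dir_sizes(data_dir):
    """Map each immediate subdirectory of data_dir to its size in bytes.

    data_dir is assumed to be the Solr data directory, which may contain
    "index" and one or more "index.<timestamp>" directories.
    """
    sizes = {}
    for entry in sorted(os.listdir(data_dir)):
        full = os.path.join(data_dir, entry)
        if os.path.isdir(full):
            sizes[entry] = dir_size(full)
    return sizes


# Only runs when a data directory is passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    report = index_dir_sizes(sys.argv[1])
    # Largest directory first.
    for name, size in sorted(report.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1024 ** 3:10.2f} GB  {name}")
```

Run it on the slave and on the master with the data directory as the only argument and compare the two listings. If a single index.<timestamp> directory accounts for nearly all of the space on the slave, that matches what is described at the top of this message.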