Alexandre, in addition to what Erick said, you may want to check on the slave whether the 300+ GB is taken up by the "data" directory as a whole or by "index.<timestamp>" directories inside it.
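If it helps, here is a rough Python sketch of that check. It only adds up file sizes for each immediate subdirectory of a data directory, so you can see whether "index" or one or more "index.<timestamp>" directories account for the space. The function names and the example path are placeholders of mine, not anything Solr provides.

```python
import os


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def subdir_sizes(data_dir):
    """Map each immediate subdirectory of data_dir to its total size in bytes."""
    sizes = {}
    for entry in sorted(os.listdir(data_dir)):
        full = os.path.join(data_dir, entry)
        if os.path.isdir(full):
            sizes[entry] = dir_size_bytes(full)
    return sizes


def report_lines(data_dir):
    """Format one line per subdirectory, largest first, with sizes in GB."""
    ordered = sorted(subdir_sizes(data_dir).items(), key=lambda kv: -kv[1])
    return ["%10.2f GB  %s" % (size / 1024.0 ** 3, name) for name, size in ordered]
```

On the slave you would call something like `print("\n".join(report_lines("/path/to/solr/data")))`, replacing the path (hypothetical here) with your actual <solr home>/data.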
On Fri, Mar 23, 2012 at 12:25 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> not really, unless perhaps you're issuing commits or optimizes
> on the _slave_ (which you should NOT do).
>
> Replication happens based on the version of the index on the master.
> True, it starts out as a timestamp, but then successive versions
> just have that number incremented. The version number
> in the index on the slave is compared against the one on the master,
> but the actual time (on the slave or master) is irrelevant. This is
> explicitly to avoid problems with time synching across
> machines/timezones/whatever....
>
> It would be instructive to look at the admin/info page to see what
> the index version is on the master and slave.
>
> But, if you optimize or commit (I think) on the _slave_, you might
> change the timestamp and mess things up (although I'm reaching
> here, I don't know this for certain).
>
> What's the index look like on the slave as compared to the master?
> Are there just a bunch of files on the slave? Or a bunch of directories?
>
> Instead of re-indexing on the master, you could try to bring down the
> slave, blow away the entire index and start it back up. Since this is a
> production system, I'd only try this if I had more than one slave. Although
> you could bring up a new slave and attach it to the master and see
> what happens there. You wouldn't affect production if you didn't point
> incoming requests at it...
>
> Best
> Erick
>
> On Fri, Mar 23, 2012 at 11:03 AM, Alexandre Rocco <alel...@gmail.com>
> wrote:
> > Erick,
> >
> > We're using Solr 3.3 on Linux (CentOS 5.6).
> > The /data dir on master is actually 1.2G.
> >
> > I haven't tried to recreate the index yet. Since it's a production
> > environment, I guess that I can stop replication and indexing and then
> > recreate the master index to see if it makes any difference.
> >
> > Also just noticed another thread here named "Simple Slave Replication
> > Question" that says it could be a problem if I'm seeing a
> > /data/index with a timestamp on the slave node.
> > Is this info relevant to this issue?
> >
> > Thanks,
> > Alexandre
> >
> > On Fri, Mar 23, 2012 at 11:48 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> What version of Solr and what operating system?
> >>
> >> But regardless, this shouldn't be happening. Indexes can
> >> temporarily double in size, but any extras should be
> >> cleaned up relatively soon.
> >>
> >> On the master, what's the total size of the <solr home>/data directory?
> >> I'm a little suspicious of the <backupAfter> on your master, but I
> >> don't think that's the root of your problem....
> >>
> >> Are you recreating the index on the master (by deleting the
> >> index directory and starting over)?
> >>
> >> This is unusual, and I suspect it's something odd in your configuration,
> >> but I confess I'm at a loss as to what.
> >>
> >> Best
> >> Erick
> >>
> >> On Fri, Mar 23, 2012 at 10:28 AM, Alexandre Rocco <alel...@gmail.com>
> >> wrote:
> >> > Hello,
> >> >
> >> > We have a Solr index that averages 1.19 GB in size.
> >> > After configuring the replication, the index size on the slave machine
> >> > is growing exponentially.
> >> > Currently we have a slave with 323.44 GB in size.
> >> > Is there anything that could cause this behavior?
> >> > The current replication config is below.
> >> >
> >> > Master:
> >> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> >> >   <lst name="master">
> >> >     <str name="replicateAfter">commit</str>
> >> >     <str name="replicateAfter">startup</str>
> >> >     <str name="backupAfter">startup</str>
> >> >     <str name="confFiles">elevate.xml,protwords.txt,schema.xml,spellings.txt,stopwords.txt,synonyms.txt</str>
> >> >   </lst>
> >> > </requestHandler>
> >> >
> >> > Slave:
> >> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> >> >   <lst name="slave">
> >> >     <str name="masterUrl">http://master:8984/solr/Index/replication</str>
> >> >   </lst>
> >> > </requestHandler>
> >> >
> >> > Any pointers will be useful.
> >> >
> >> > Thanks,
> >> > Alexandre
> >> >