Ravi, as far as I remember, this is how the replication logic works (see SnapPuller class, fetchLatestIndex method):

> 1. Does the Slave get the whole index every time during replication or
> just the delta since the last replication happened ?

It looks at the index version AND the index generation. If both the slave's version and generation are the same as on the master, nothing gets replicated. If the master's generation is greater than the slave's, the slave fetches the delta files only (even if a partial merge was done on the master) and puts the new files from the master into the same index folder on the slave (either index or index.<timestamp>, see further explanation below).

However, if the master's index generation is equal to or less than the one on the slave, the slave does a full replication: it fetches all files of the master's index and places them into a separate folder on the slave (index.<timestamp>). Then, if the fetch is successful, the slave updates (or creates) the index.properties file and writes the name of the "current" index folder there. The "old" index.<timestamp> folder(s) are kept in 1.4.x, which was treated as a bug (see SOLR-2156) and was fixed in 3.1.

After this, the slave does a commit or a core reload, depending on whether the config files were replicated. There is another bug in 1.4.x that makes replication fail if the slave needs to do a full replication AND the config files were changed. It is also fixed in 3.1 (see SOLR-1983).

> 2. If there are huge number of queries being done on slave will it
> affect the replication ? How can I improve the performance ? (see the
> replications details at he bottom of the page)

From my experience, about half of the replication time is spent flushing the transferred data to disk, so the I/O impact is important. Query traffic that competes for the same disk is therefore likely to slow replication down.

> 3. Will the segment names be same be same on master and slave after
> replication ? I see that they are different. Is this correct ? If it
> is correct how does the slave know what to fetch the next time i.e.
> the delta.

They should be the same. The slave fetches the changed files only (see above); also look at the SnapPuller code.

> 4. When and why does the index.<TIMESTAMP> folder get created ? I see
> this type of folder getting created only on slave and the slave
> instance is pointing to it.

See above.

> 5. Does replication process copy both the index and index.<TIMESTAMP> folder ?

The index.<timestamp> folder gets created only if a full replication has happened at least once. Otherwise, the slave will use the index folder.

> 6. what happens if the replication kicks off even before the previous
> invocation has not completed ? will the 2nd invocation block or will
> it go through causing more confusion ?

There is a lock (snapPullLock in ReplicationHandler) that prevents two replications from running simultaneously. If there is no bug, the second invocation should just return silently from the replication call. (I personally never had a problem with this, so it looks like there is no bug :)

> 7. If I have to prep a new master-slave combination is it OK to copy
> the respective contents into the new master-slave and start solr ? or
> do I have have to wipe the new slave and let it replicate from its new
> master ?

If the new master has a different index, the slave will create a new index.<timestamp> folder. There is no need to wipe it.

> 8. Doing an 'ls | wc -l' on index folder of master and slave gave 194
> and 17968 respectively...I slave has lot of segments_xxx files. Is
> this normal ?

No. It looks like in your case the slave has been replicating into the same folder for a long time, but the old files are not getting deleted for some reason. Try restarting the slave or doing a core reload on it to see whether the old segments go away.

-Alexander
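
P.S. For question 1, the decision can be sketched roughly as below. This is only an illustration of the rules as I described them in this mail, not the actual SnapPuller source: the class name, the enum, and the decide method are made up for the example, and the real code does much more (file lists, temp folders, commit/reload).

```java
// Hypothetical sketch: encodes the version/generation rules described in this
// mail. None of these names exist in Solr; they are for illustration only.
final class ReplicationDecisionSketch {

    /** What the slave would do after comparing itself with the master. */
    enum Action { NOTHING, FETCH_DELTA, FETCH_FULL }

    static Action decide(long masterVersion, long masterGeneration,
                         long slaveVersion, long slaveGeneration) {
        // Same version AND same generation: already in sync, nothing to fetch.
        if (masterVersion == slaveVersion && masterGeneration == slaveGeneration) {
            return Action.NOTHING;
        }
        // Master is ahead: fetch only the changed files into the current
        // index folder (index or index.<timestamp>).
        if (masterGeneration > slaveGeneration) {
            return Action.FETCH_DELTA;
        }
        // Master generation is equal to or lower than the slave's:
        // fetch everything into a new index.<timestamp> folder.
        return Action.FETCH_FULL;
    }
}
```

For example, decide(100L, 7L, 90L, 5L) gives FETCH_DELTA, while decide(100L, 3L, 90L, 5L) gives FETCH_FULL, which is the case where a new index.<timestamp> folder shows up on the slave.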