RE: replication, disk space

Dyer, James Thu, 19 Jan 2012 10:25:37 -0800

You can do all the steps to rename the timestamped dir back to "index", but I 
don't think you have to.  Solr will know on restart to use the 
timestamped directory so long as it is named in index.properties (sorry, I must 
have told you to look at the wrong file...I'm working from old memories here.)  
You might want to test this in your dev environment, but I think it's going to 
work.  The only question is whether it really bothers you that the index isn't 
being stored in "index"...
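
As a rough illustration of the lookup described above, the sketch below reads 
index.properties from a data directory, reports which directory the "index" key 
points to, and lists any other index* directories that are presumably unused. 
It is a minimal sketch, not Solr's actual startup code: the helper names 
(read_properties, active_index_dir, unused_index_dirs) are hypothetical, the 
parser only handles simple key=value lines, and the fallback to "index" when 
the file or key is missing is an assumption based on this thread.

```python
from pathlib import Path


def read_properties(path):
    """Parse simple key=value lines, skipping blanks and #/! comments."""
    props = {}
    for raw in Path(path).read_text(encoding="latin-1").splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def active_index_dir(data_dir):
    """Return the directory named by index.properties.

    Assumption: falls back to data_dir/index when there is no
    index.properties file or it has no "index" key.
    """
    data_dir = Path(data_dir)
    props_file = data_dir / "index.properties"
    name = "index"
    if props_file.is_file():
        name = read_properties(props_file).get("index", "index")
    return data_dir / name


def unused_index_dirs(data_dir):
    """List index* directories other than the active one."""
    data_dir = Path(data_dir)
    active = active_index_dir(data_dir)
    return sorted(
        p for p in data_dir.iterdir()
        if p.is_dir() and p.name.startswith("index") and p != active
    )
```

Calling active_index_dir("/opt/solr/solr_searcher/prod/data") on the slave 
discussed here would be one way to double-check which directory is live before 
deleting anything; compare the result with what the replication admin page 
reports.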


The reason why you get into this situation with the timestamped directory is 
explained here:  
http://wiki.apache.org/solr/SolrReplication#What_if_I_add_documents_to_the_slave_or_if_slave_index_gets_corrupted.3F

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Jonathan Rochkind [mailto:rochk...@jhu.edu] 
Sent: Thursday, January 19, 2012 11:43 AM
To: solr-user@lucene.apache.org
Cc: Dyer, James
Subject: Re: replication, disk space

Okay, I do have an index.properties file too, and THAT one does contain 
the name of an index directory.

But it's got the name of the timestamped index directory!  Not sure how 
that happened, could have been Solr trying to recover from running out 
of disk space in the middle of a replication? I certainly never did that 
intentionally.

But okay, can someone confirm whether this plan makes sense for restoring 
things without downtime:

1. rm the 'index' directory, which seems to be an old copy of the index 
at this point
2. 'mv index.20120113121302 index'
3. Manually edit index.properties to have index=index, not 
index=index.20120113121302
4. Send reload core command.
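
The four steps above can be sketched in Python as follows. This is only an 
illustration under stated assumptions, not a tested recovery procedure: the 
function names are hypothetical, the reload URL assumes the CoreAdmin handler 
is exposed at <base_url>/admin/cores (its path is configurable), and the core 
name and base URL are placeholders you would have to supply.

```python
import shutil
import urllib.parse
import urllib.request
from pathlib import Path


def swap_index_dirs(data_dir, timestamped_name):
    """Steps 1-3: drop the stale 'index', rename the live dir, fix the file."""
    data_dir = Path(data_dir)
    live = data_dir / timestamped_name
    old = data_dir / "index"
    if not live.is_dir():
        raise FileNotFoundError(live)

    # Step 1: remove the old 'index' directory.
    if old.exists():
        shutil.rmtree(old)

    # Step 2: rename the timestamped directory to 'index'.
    live.rename(old)

    # Step 3: point index.properties back at 'index'.
    props = data_dir / "index.properties"
    lines = props.read_text(encoding="latin-1").splitlines()
    updated = [
        "index=index" if ln.strip().startswith("index=") else ln
        for ln in lines
    ]
    props.write_text("\n".join(updated) + "\n", encoding="latin-1")


def reload_url(base_url, core):
    """Build the CoreAdmin RELOAD URL (assumed admin path: /admin/cores)."""
    query = urllib.parse.urlencode({"action": "RELOAD", "core": core})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


def reload_core(base_url, core, timeout=60):
    """Step 4: send the reload request and return the HTTP status code."""
    with urllib.request.urlopen(reload_url(base_url, core), timeout=timeout) as resp:
        return resp.status
```

Between the rename and the reload, the running searcher still has files open 
in the renamed directory. That generally works on POSIX filesystems, but how 
Solr behaves in that window is not something this sketch verifies, so try it 
on a dev copy first.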

Does this make sense?  (I just experimentally tried a reload core 
command, and even though it's not supposed to, it DID result in about 20 
seconds of unresponsiveness from my solr server. Not sure why; it could 
just be lack of CPU or RAM on the server to do what's being asked of it. 
But if that's the best I can do, 20 seconds of unavailability, I'll take 
it).

On 1/19/2012 12:37 PM, Jonathan Rochkind wrote:
> Hmm, I don't have a "replication.properties" file, I don't think. Oh 
> wait, yes I do, there it is!  I guess the replication process makes 
> this file?
>
> Okay....
>
> I don't see an index directory in the replication.properties file at 
> all though. Below is my complete replication.properties.
>
> So I'm still not sure how to properly recover from this situation 
> without downtime. It _looks_ to me like the timestamped directory is 
> actually the live/recent one.  Its files have a more recent 
> timestamp, and it's the one that /admin/replication.jsp mentions.
>
> replication.properties:
>
> #Replication details
> #Wed Jan 18 10:58:25 EST 2012
> confFilesReplicated=[solrconfig.xml, schema.xml]
> timesIndexReplicated=350
> lastCycleBytesDownloaded=6524299012
> replicationFailedAtList=1326902305288,1326406990614,1326394654410,1326218508294,1322150197956,1321987735253,1316104240679,1314371534794,1306764945741,1306678853902
> replicationFailedAt=1326902305288
> timesConfigReplicated=1
> indexReplicatedAtList=1326902305288,1326825419865,1326744428192,1326645554344,1326569088373,1326475488777,1326406990614,1326394654410,1326303313747,1326218508294
> confFilesReplicatedAt=1316547200637
> previousCycleTimeInSeconds=295
> timesFailed=54
> indexReplicatedAt=1326902305288
> ~
>
>
> On 1/18/2012 1:41 PM, Dyer, James wrote:
>> I've seen this happen when the configuration files change on the 
>> master and replication deems it necessary to do a core-reload on the 
>> slave. In this case, replication copies the entire index to the new 
>> directory then does a core re-load to make the new config files and 
>> new index directory go live.  Because it is keeping the old searcher 
>> running while the new searcher is being started, both index copies 
>> exist until the swap is complete.  I remember having the same concern 
>> about re-starts, but I believe I tested this and solr will look at 
>> the "replication.properties" file on startup and determine the 
>> correct index dir to use from that.  So (If my memory is correct) you 
>> can safely delete "index" so long as "replication.properties" points 
>> to the other directory.
>>
>> I wasn't familiar with SOLR-1781.  Maybe replication is supposed to 
>> clean up the extra directories and sometimes doesn't?  In any case, 
>> I've found that whenever it happens it's OK to go out and delete the 
>> one(s) not being used, even if that means deleting "index".
>>
>> James Dyer
>> E-Commerce Systems
>> Ingram Content Group
>> (615) 213-4311
>>
>> -----Original Message-----
>> From: Artem Lokotosh [mailto:arco...@gmail.com]
>> Sent: Wednesday, January 18, 2012 12:24 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: replication, disk space
>>
>> Which OS are you using?
>> This may be related to this Solr bug:
>> https://issues.apache.org/jira/browse/SOLR-1781
>>
>> On Wed, Jan 18, 2012 at 6:32 PM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
>>> So Solr 1.4. I have a solr master/slave, where it actually doesn't poll for
>>> replication, it only replicates irregularly when I issue a replicate command
>>> to it.
>>>
>>> After the last replication, the slave, in solr_home, has a data/index
>>> directory as well as a data/index.20120113121302 directory.
>>>
>>> The /admin/replication/index.jsp admin page reports:
>>>
>>> Local Index
>>> Index Version: 1326407139862, Generation: 183
>>> Location: /opt/solr/solr_searcher/prod/data/index.20120113121302
>>>
>>>
>>> So does this mean the index.XXXX file is actually the one currently being
>>> used live, not the straight 'index'? Why?
>>>
>>> I can't afford the disk space to leave both of these around indefinitely.
>>> After replication completes and is committed, why would two index dirs be
>>> left?  And how can I restore this to one index dir, without downtime? If
>>> it's really using the "index.XXXXX" directory, then I could just delete the
>>> "index" directory, but that's a bad idea, because next time the server
>>> starts it's going to be looking for "index", not "index.XXXX".  And if it's
>>> using the timestamped index file now, I can't delete THAT one now either.
>>>
>>> If I was willing to restart the tomcat container, then I could delete one,
>>> rename the other, etc. But I don't want downtime.
>>>
>>> I really don't understand what's going on or how it got in this state. Any
>>> ideas?
>>>
>>> Jonathan
>>>
>>
>>
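
A note on the replication.properties file quoted in this thread: its 
timestamps appear to be milliseconds since the Unix epoch. Read that way, 
replicationFailedAt=1326902305288 lines up with the file's own header comment 
(Wed Jan 18 10:58:25 EST 2012). The sketch below, with hypothetical helper 
names, converts such values to UTC datetimes so the failure and replication 
lists are easier to read. The millisecond interpretation is an inference from 
the values shown, not something confirmed by Solr documentation here.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_utc(value):
    """Convert a millisecond epoch value (str or int) to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=int(value))


def parse_millis_list(value):
    """Convert a comma-separated list of millisecond values."""
    return [epoch_millis_to_utc(v) for v in value.split(",") if v.strip()]


# replicationFailedAt from the quoted file:
# epoch_millis_to_utc("1326902305288") is 2012-01-18 15:58:25.288 UTC,
# which is 10:58:25 EST, the same moment as the file's header comment.
```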
