Re: Replication Handler Severe Error: Unable to move index file

Noble Paul നോബിള്‍ नोब्ळ् Sun, 24 Jan 2010 06:00:12 -0800

On Fri, Jan 22, 2010 at 4:24 AM, Trey <solrt...@gmail.com> wrote:
> Unfortunately, when I went back to look at the logs this morning, the log
> file had been blown away... that puts a major damper on my debugging
> capabilities - so sorry about that.  As a double whammy, we optimize
> nightly, so the old index files have completely changed at this point.
>
> I do not remember seeing an exception / stack trace in the logs associated
> with the "SEVERE *Unable to move file*" entry, but we were grepping the
> logs, so if it was outputted onto another line it could have possibly been
> there.  I wouldn't really expect to see anything based upon the code in
> SnapPuller.java:
>
> /**
>   * Copy a file by the File#renameTo() method. If it fails, it is
>   * considered a failure
>   * <p/>
>   * Todo may be we should try a simple copy if it fails
>   */
>  private boolean copyAFile(File tmpIdxDir, File indexDir, String fname,
>                            List<String> copiedfiles) {
>    File indexFileInTmpDir = new File(tmpIdxDir, fname);
>    File indexFileInIndex = new File(indexDir, fname);
>    boolean success = indexFileInTmpDir.renameTo(indexFileInIndex);
>    if (!success) {
>      LOG.error("Unable to move index file from: " + indexFileInTmpDir
>              + " to: " + indexFileInIndex);
>      for (String f : copiedfiles) {
>        File indexFile = new File(indexDir, f);
>        if (indexFile.exists())
>          indexFile.delete();
>      }
>      delTree(tmpIdxDir);
>      return false;
>    }
>    return true;
>  }
>
> In terms of whether this is a one-off case: this is the first occurrence of
> this I have seen in the logs.  We tried to replicate the conditions under
> which the exception occurred, but were unable.  I'll send along some more
> useful info if this happens again.
>
> In terms of the behavior we saw: It appears that a replication occurred and
> the "Unable to move file" error occurred.  As a result, it looks like the
> ENTIRE index was subsequently replicated again into a temporary directory
> (several times, over and over).
>
> The end result was that we had multiple full copies of the index in
> temporary index folders on the slave, and the original still couldn't be
> updated (the move to ./index wouldn't work).  Does Solr ever hold files open
> in a manner that would prevent a file in the index directory from being
> overwritten?


There is a TODO which says maybe we should try a manual copy if the move
(renameTo) fails. We never did it because we never observed renameTo failing.
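
A minimal sketch of what such a fallback could look like, not the actual
SnapPuller code. It assumes Java 7 or later for java.nio.file, and the names
MoveWithFallback and moveOrCopy are hypothetical, made up for illustration:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

public class MoveWithFallback {

    /**
     * Hypothetical helper (not part of SnapPuller): try a rename first and
     * fall back to copy-then-delete when the rename is refused.
     */
    public static boolean moveOrCopy(File from, File to) {
        if (from.renameTo(to)) {
            return true; // cheap path: the rename worked
        }
        try {
            // Unlike renameTo, which only returns false, Files.copy throws
            // an IOException that describes what went wrong.
            Files.copy(from.toPath(), to.toPath(),
                    StandardCopyOption.REPLACE_EXISTING);
            // Remove the temporary copy once the data is in place.
            Files.deleteIfExists(from.toPath());
            return true;
        } catch (IOException e) {
            System.err.println("Copy fallback failed: " + e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo with throwaway directories standing in for index.<timestamp>
        // and index; the file name is borrowed from the log line above.
        File tmpDir = Files.createTempDirectory("index.tmp").toFile();
        File indexDir = Files.createTempDirectory("index").toFile();
        File src = new File(tmpDir, "_6qv.fnm");
        Files.write(src.toPath(), "demo".getBytes());
        File dst = new File(indexDir, "_6qv.fnm");
        System.out.println("moved: " + moveOrCopy(src, dst));
    }
}
```

The copy path is not atomic, so a searcher could in principle see a partially
written file; whether that matters for the index directory is an open
question that this sketch does not address.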
>
>
> 2010/1/21 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
>
>> Is it a one-off case? Do you observe this frequently?
>>
>> On Thu, Jan 21, 2010 at 11:26 AM, Otis Gospodnetic
>> <otis_gospodne...@yahoo.com> wrote:
>> > It's hard to tell without poking around, but one of the first things I'd
>> > do would be to look for /home/solr/cores/core8/index.20100119103919/_6qv.fnm
>> > - does this file/dir really exist?  Or, rather, did it exist when the error
>> > happened?
>> >
>> > I'm not looking at the source code now, but is that really the only error
>> > you got?  No exception stack trace?
>> >
>> >  Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>> >
>> >
>> >
>> > ----- Original Message ----
>> >> From: Trey <solrt...@gmail.com>
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Wed, January 20, 2010 11:54:43 PM
>> >> Subject: Replication Handler Severe Error: Unable to move index file
>> >>
>> >> Does anyone know what would cause the following error?:
>> >>
>> >> 10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile
>> >>
>> >>      SEVERE: *Unable to move index file* from:
>> >> /home/solr/cores/core8/index.20100119103919/_6qv.fnm to:
>> >> /home/solr/cores/core8/index/_6qv.fnm
>> >> This occurred a few days back and we noticed that several full copies of the
>> >> index were subsequently pulled from the master to the slave, effectively
>> >> evicting our live index from RAM (the linux os cache), and killing our query
>> >> performance due to disk io contention.
>> >>
>> >> Has anyone experienced this behavior recently?  I found an old thread about
>> >> this error from early 2009, but it looks like it was patched almost a year
>> >> ago:
>> >>
>> >> http://old.nabble.com/%22Unable-to-move-index-file%22-error-during-replication-td21157722.html
>> >>
>> >>
>> >> Additional Relevant information:
>> >> -We are using the Solr 1.4 official release + a field collapsing patch from
>> >> mid December (which I believe should only affect query side, not indexing /
>> >> replication).
>> >> -Our Replication PollInterval for slaves checking the master is very small
>> >> (15 seconds)
>> >> -We have a multi-box distributed search with each box possessing multiple
>> >> cores
>> >> -We issue a manual (rolling) optimize across the cores on the master once a
>> >> day (occurred ~ 1-2 hours before the above timeline)
>> >> -maxWarmingSearchers is set to 1.
>> >
>> >
>>
>>
>>
>> --
>> -----------------------------------------------------
>> Noble Paul | Systems Architect| AOL | http://aol.com
>>
>



-- 
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com
