Is there any way that Windows 7 and disk drivers are not honoring the fsync() calls? That would cause files and/or blocks to get saved out of order.
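
For reference, here is roughly what "calling fsync" looks like from Java. This is a minimal sketch and is not taken from Lucene or Solr: the class name DurableWrite and the helper writeAndSync are made up for illustration, and it uses java.nio.file, which needs a newer JVM than the 1.6.0_18 shown in the report below. FileChannel.force(true) asks the OS to push the file's data and metadata to the storage device; whether the driver or the drive's write cache actually honors that request is exactly the open question.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DurableWrite {

    // Hypothetical helper (not a Lucene API): write the bytes, then request
    // a flush to the storage device before the channel is closed.
    public static void writeAndSync(Path path, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            // A single write() may be partial, so loop until the buffer is drained.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // true = also flush file metadata (e.g. length), not just content.
            // This only *requests* durability from the OS; a driver or disk
            // cache that ignores the flush can still lose or reorder writes.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path p = Files.createTempFile("durable", ".bin");
        writeAndSync(p, new byte[512]);
        System.out.println("wrote " + Files.size(p) + " bytes to " + p);
        Files.delete(p);
    }
}
```

If the flush were silently dropped somewhere below the JVM, a file could be left shorter than its expected size after a crash even though the write and force calls both returned normally.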

On Tue, Nov 30, 2010 at 3:24 PM, Peter Sturge <peter.stu...@gmail.com> wrote:
> After a recent Windows 7 crash (:-\), upon restart, Solr starts giving
> LockObtainFailedException errors: (excerpt)
>
> 30-Nov-2010 23:10:51 org.apache.solr.common.SolrException log
> SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock
> obtain timed out:
> nativefsl...@solr\.\.\data0\index\lucene-ad25f73e3c87e6f192c4421756925f47-write.lock
>
>
> When I run CheckIndex, I get: (excerpt)
>
> 30 of 30: name=_2fi docCount=857
> compound=false
> hasProx=true
> numFiles=8
> size (MB)=0.769
> diagnostics = {os.version=6.1, os=Windows 7, lucene.version=3.1-dev
> ${svnversion} - 2010-09-11 11:09:06, source=flush, os.arch=amd64,
> java.version=1.6.0_18,
> java.vendor=Sun Microsystems Inc.}
> no deletions
> test: open reader.........FAILED
> WARNING: fixIndex() would remove reference to this segment; full exception:
> org.apache.lucene.index.CorruptIndexException: did not read all bytes from
> file "_2fi.fnm": read 1 vs size 512
>     at org.apache.lucene.index.FieldInfos.read(FieldInfos.java:367)
>     at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:71)
>     at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:119)
>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:583)
>     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:561)
>     at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:467)
>     at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:878)
>
> WARNING: 1 broken segments (containing 857 documents) detected
>
>
> This seems to happen every time Windows 7 crashes, and it would seem
> extraordinary bad luck for this tiny test index to be in the middle of
> a commit every time.
> (it is set to commit every 40secs, but for such a small index it only
> takes millis to complete)
>
> Does this seem right? I don't remember seeing so many corruptions in
> the index - maybe it is the world of Win7 dodgy drivers, but it would
> be worth investigating if there's something amiss in Solr/Lucene when
> things go down unexpectedly...
>
> Thanks,
> Peter
>
>
> On Tue, Nov 30, 2010 at 9:19 AM, Peter Sturge <peter.stu...@gmail.com> wrote:
>> The index itself isn't corrupt - just one of the segment files. This
>> means you can read the index (less the offending segment(s)), but once
>> this happens it's no longer possible to
>> access the documents that were in that segment (they're gone forever),
>> nor write/commit to the index (depending on the env/request, you get
>> 'Error reading from index file..' and/or WriteLockError)
>> (note that for my use case, documents are dynamically created so can't
>> be re-indexed).
>>
>> Restarting Solr fixes the write lock errors (an indirect environmental
>> symptom of the problem), and running CheckIndex -fix is the only way
>> I've found to repair the index so it can be written to (rewrites the
>> corrupted segment(s)).
>>
>> I guess I was wondering if there's a mechanism that would support
>> something akin to a transactional rollback for segments.
>>
>> Thanks,
>> Peter
>>
>>
>>
>> On Mon, Nov 29, 2010 at 5:33 PM, Yonik Seeley
>> <yo...@lucidimagination.com> wrote:
>>> On Mon, Nov 29, 2010 at 10:46 AM, Peter Sturge <peter.stu...@gmail.com>
>>> wrote:
>>>> If a Solr index is running at the time of a system halt, this can
>>>> often corrupt a segments file, requiring the index to be -fix'ed by
>>>> rewriting the offending file.
>>>
>>> Really? That shouldn't be possible (if you mean the index is truly
>>> corrupt - i.e. you can't open it).
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>
>

--
Lance Norskog
goks...@gmail.com