Re: segment gets corrupted (after background merge ?)

2011-01-18 Thread Stéphane Delprat
I ran other tests: when I run CheckIndex on the master I get random errors, but when I scp the file to another server (exactly the same software) no error occurs... We will start using another server. Just one question concerning CheckIndex: what does "tokens" mean? How is it possible

Re: segment gets corrupted (after background merge ?)

2011-01-18 Thread Michael McCandless
OK thanks for bringing closure! The tokens output is the total number of indexed tokens (i.e., as if you had a counter that counted all tokens produced by analysis as the indexer consumes them). My guess is the faulty server's hardware problem also messed up this count? Mike On Tue, Jan 18, 2011
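
To make the counter analogy concrete, here is a minimal Python sketch. It is not Lucene code: the lowercase-and-split `analyze` function is a hypothetical stand-in for a real analyzer, and `count_indexed_tokens` is an illustrative name. It only shows what "total number of indexed tokens" means under that assumption: every token emitted by analysis, across all fields of all documents, bumps one counter.

```python
from typing import Dict, Iterable, Iterator


def analyze(text: str) -> Iterator[str]:
    """Hypothetical stand-in for an analyzer: lowercase, split on whitespace."""
    for raw in text.split():
        yield raw.lower()


def count_indexed_tokens(docs: Iterable[Dict[str, str]]) -> int:
    """Count every token produced by analysis, over all fields of all docs."""
    total = 0
    for doc in docs:
        for field_text in doc.values():
            for _token in analyze(field_text):
                total += 1  # one increment per token the "indexer" consumes
    return total


# Toy documents, made up for illustration only.
docs = [
    {"title": "Segment gets corrupted", "body": "after a background merge"},
    {"title": "CheckIndex output", "body": "terms freq prox"},
]
print(count_indexed_tokens(docs))
```

Because the count depends only on the documents and the analyzer, it should be identical every time the same index is checked; a figure that changes between runs points at the environment rather than the data.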

Re: segment gets corrupted (after background merge ?)

2011-01-14 Thread Michael McCandless
Right, but removing a segment out from under a live IW (when you run CheckIndex with -fix) is deadly, because that other IW doesn't know you've removed the segment, and will later commit a new segment infos still referencing that segment. The nature of this particular exception from CheckIndex is

Re: segment gets corrupted (after background merge ?)

2011-01-14 Thread Stéphane Delprat
So I ran checkIndex (without -fix) 5 times in a row : SOLR was running, but no client connected to it. (just the slave which was synchronizing every 5 minutes) summary : 1: all good 2: 2 errors: (seg 1 2) terms, freq, prox...ERROR [term blog_id:104150: doc 324697 = lastDoc 324697] terms,

Re: segment gets corrupted (after background merge ?)

2011-01-14 Thread Michael McCandless
OK given that you're seeing non-deterministic results on the same index... I think this is likely a hardware issue or a JRE bug? If you move that index over to another env and run CheckIndex, is it consistent? Mike On Fri, Jan 14, 2011 at 9:00 AM, Stéphane Delprat
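
One way to probe for this kind of non-determinism, independently of Lucene, is to checksum every file in the index directory several times and see whether the digests agree between passes. The sketch below is an illustration under stated assumptions, not a CheckIndex replacement: `hash_dir` and `unstable_files` are hypothetical helper names, and the check is only meaningful while nothing is writing to the directory. Repeated reads may also be served from the OS page cache, so agreement between passes does not prove the hardware is healthy.

```python
import hashlib
import os
import tempfile
from typing import Dict, List


def hash_dir(index_dir: str) -> Dict[str, str]:
    """Return {file name: SHA-256 hex digest} for regular files in index_dir."""
    digests = {}
    for name in sorted(os.listdir(index_dir)):
        path = os.path.join(index_dir, name)
        if not os.path.isfile(path):
            continue
        h = hashlib.sha256()
        with open(path, "rb") as f:
            # Read in 1 MiB chunks so large segment files are not loaded whole.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digests[name] = h.hexdigest()
    return digests


def unstable_files(index_dir: str, passes: int = 3) -> List[str]:
    """Hash the directory `passes` times; list files whose digest ever differs
    from the first pass (including files that appear or disappear)."""
    baseline = hash_dir(index_dir)
    unstable = set()
    for _ in range(passes - 1):
        current = hash_dir(index_dir)
        for name in set(baseline) | set(current):
            if baseline.get(name) != current.get(name):
                unstable.add(name)
    return sorted(unstable)


# Demo on a throwaway directory; the file name and bytes are made up.
with tempfile.TemporaryDirectory() as d:
    with open(os.path.join(d, "example.bin"), "wb") as f:
        f.write(b"not a real segment file")
    print(unstable_files(d))
```

The digests from `hash_dir` could also be compared between the original machine and the copy, to confirm that both environments are checking the same bytes before comparing their CheckIndex results.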

Re: segment gets corrupted (after background merge ?)

2011-01-13 Thread Stéphane Delprat
I understand less and less what is happening to my Solr. I did a checkIndex (without -fix) and there was an error... So I did another checkIndex with -fix and then the error was gone. The segment was all right. During checkIndex I do not shut down the Solr server, I just make sure no client

Re: segment gets corrupted (after background merge ?)

2011-01-13 Thread Michael McCandless
Generally it's not safe to run CheckIndex if a writer is also open on the index. It's not safe because CheckIndex could hit FNFE's on opening files, or, if you use -fix, CheckIndex will change the index out from under your other IndexWriter (which will then cause other kinds of corruption). That

Re: segment gets corrupted (after background merge ?)

2011-01-13 Thread Lance Norskog
1) CheckIndex is not supposed to change a corrupt segment, only remove it. 2) Are you using local hard disks, or do you run on a common SAN or remote file server? I have seen corruption errors on SANs, where existing files have random changes. On Thu, Jan 13, 2011 at 11:06 AM, Michael McCandless

Re: segment gets corrupted (after background merge ?)

2011-01-12 Thread Stéphane Delprat
I got another corruption. It sure looks like it's the same type of error. (on a different field) It's also not linked to a merge, since the segment size did not change. *** good segment : 1 of 9: name=_ncc docCount=1841685 compound=false hasProx=true numFiles=9 size

Re: segment gets corrupted (after background merge ?)

2011-01-12 Thread Michael McCandless
Curious... is it always a docFreq=1 != num docs seen 0 + num docs deleted 0? It looks like new deletions were flushed against the segment (del file changed from _ncc_22s.del to _ncc_24f.del). Are you hitting any exceptions during indexing? Mike On Wed, Jan 12, 2011 at 10:33 AM, Stéphane

Re: segment gets corrupted (after background merge ?)

2011-01-11 Thread Michael McCandless
When you hit corruption is it always this same problem?: java.lang.RuntimeException: term source:margolisphil docFreq=1 != num docs seen 0 + num docs deleted 0 Can you run with Lucene's IndexWriter infoStream turned on, and catch the output leading to the corruption? If something is somehow