Re: Index "corruption" makes it return a different result

Michael McCandless Wed, 26 Mar 2008 12:47:18 -0700

OK.

I would recommend upgrading to 2.3.1. There were some corruption issues with term vectors that could cause the wrong document's term vectors to come back.

That screen shot is spooky! Is it possible that one of the documents you indexed had that content? (It could simply be a stored field.)


Mike

Lucas F. A. Teixeira wrote:

Thanks Michael!

Lucene version: 2.3.0
Here is a screenshot of editing the cfs file: http://img296.imageshack.us/my.php?image=indexow4.jpg
Take a look!

[]s,

Lucas

Michael McCandless wrote:
OK I think I follow now.

Which version of Lucene was this?
If it's not too large, can you post the CFS file that got mixed up? Be sure to cc me directly on the mail because the mailing list software removes attachments.
Mike

Lucas F. A. Teixeira wrote:
This is just one of the index files.
As I said, the local disk where the index is generated is not full; the full disk is the shared storage where my application server stores its logs. When this disk hit 100%, the whole indexing process stopped (of course, all the processing instances of this managed server stopped).
The "thing" is that the index was not corrupted; one of the index files has these log messages from my application server inside it, probably a JVM problem mixing two IO buffers when one of them couldn't flush (the logs partition). For me it would be normal if it caused index corruption... :-)
The second and weirdest thing is that my client application just read the index and ran some queries on it, always returning different (but consistent) results.
I tried to edit the index file and remove the application server logs that were inside it, and after that?? CorruptIndexException! :-)
Wow!
I think this issue involves more than just Lucene... I had some problems in my JVM IO buffer handling, of course. But my points are the two above... ;-)
[]s,

Lucas




Michael McCandless wrote:
I couldn't quite follow the part about "_al1.cfs".
It sounds like your indexing process hit a disk-full event that led to this error? Can you post the full exception(s) from the disk full?
Which version of Lucene are you using?

Mike

Lucas F. A. Teixeira wrote:
Hello all!

I had a problem this week, and I'd like to share it with you all.
My WebLogic server that generates my index writes its logs to a shared storage. During my indexing process (SOLR+Lucene), this shared storage became 100% full, and everything collapsed (all servers that use this shared storage). But my index (which is generated on the local filesystem) just "grabbed" some of the server's logs (anyone who knows WebLogic knows the managed server access log, that's the guy) from the buffer (my supposition), and put them inside my index files! Take a look at what my "_al1.cfs" looked like between some binary parts of the file:
2008-03-19 - 02:31:03 - [ip] - POST -200 - /AcomProductSyncServiceWeb/AcomProductSyncService
2008-03-19 - 02:31:03 - [ip] - POST -200 - /AcomProductSyncServiceWeb/AcomProductSyncService
2008-03-19 - 02:31:04 - [ip] - POST -200 - /AcomProductSyncServiceWeb/AcomProductSyncService
2008-03-19 - 02:31:04 - [ip] - POST -200 - /AcomProductSyncServiceWeb/AcomProductSyncService
2008-03-19 - 02:31:04 - [ip] - POST -200 - /AcomProductSyncServiceWeb/AcomProductSyncService
The most incredible thing is that I can open the index normally, without a CorruptIndexException. That's really bad for me, because the application didn't warn about a corrupted index (of course, it is not). I can open it with the Luke app, and with this simple code snippet accessing the Lucene index directly, without Solr:
IndexReader indexReader = IndexReader.open(FSDirectory.getDirectory("C/index/index.2008-03-19"));
IndexSearcher indexSearcher = new IndexSearcher(indexReader);
TermQuery termQuery = new TermQuery(new Term("itemId", "680804"));

Hits hits = indexSearcher.search(termQuery);
Iterator itHits = hits.iterator();
while (itHits.hasNext()) {
    Hit hit = (Hit) itHits.next();
    Document document = hit.getDocument();
    String itemId = document.getField("itemId").stringValue();
    System.out.println("itemId=" + itemId);
}

indexSearcher.close();
indexReader.close();
OK, OK. But if it's opening, what's my real problem? Running the little search above, the Document that I got was a different one, with information different from the one I was looking for (the one with the itemId field = 680804). The whole document was another document (but a valid document that I had indexed before). The itemId value printed by the code above was 578340. Wow!!
I can reproduce this error anytime with this code or with Luke on this corrupted index, but it was terrible for me to find the exact point of this fault.
I've reindexed everything, and that solved my problem. But I want to know if anyone has any idea why this happened...
Thanks people!

[]s,

Lucas Teixeira
[EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
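
For anyone who runs into the same symptom: one rough way to check whether text like the access-log lines quoted above has leaked into an index file such as "_al1.cfs" is to scan the file for long runs of printable ASCII, much like the Unix strings utility does. The following is only a minimal sketch in plain Java, not anything that ships with Lucene; the class name AsciiRunScanner, the method findRuns, and the minimum run length of 40 are hypothetical choices made for illustration. Keep in mind that a healthy index also contains readable text (terms and stored fields), so a match is a hint to look closer, not proof of corruption.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: lists long printable-ASCII runs in a binary file.
public class AsciiRunScanner {

    /** Returns every run of at least minLen consecutive printable ASCII bytes in the stream. */
    static List<String> findRuns(InputStream in, int minLen) throws IOException {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int b;
        while ((b = in.read()) != -1) {
            if (b >= 0x20 && b <= 0x7E) {
                // Printable ASCII (space through tilde): extend the current run.
                current.append((char) b);
            } else {
                // Any other byte ends the run; keep it only if it is long enough.
                if (current.length() >= minLen) {
                    runs.add(current.toString());
                }
                current.setLength(0);
            }
        }
        // Flush a run that reaches the end of the stream.
        if (current.length() >= minLen) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java AsciiRunScanner <index-file>");
            return;
        }
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            // 40 is an arbitrary threshold, chosen so that short terms are mostly skipped.
            for (String run : findRuns(in, 40)) {
                System.out.println(run);
            }
        }
    }
}
```

Run it against a copy of the suspect file and look for lines that clearly do not belong to your documents, such as the "POST -200" access-log entries shown earlier in this thread.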