Hi Michael
Glad you asked, and thanks in advance for your help! I am trying to
reindex 325 emails. It's a pretty straightforward use of Lucene. I
create an IndexWriter, write a whole bunch of documents, and then close
the index every 2 seconds. See the attached source code. In case you
are wondering, there is only ever one VolumeIndex created.
Consequently, there is only ever one IndexWriter created in the entire
application. After further testing, I don't think the problem is related
to IndexReader, since the problem still occurs even if I don't execute a
search. The number of emails indexed varies between around 305 and 325.
The exception that gets thrown prevents the emails from being indexed.
I've tried removing the synchronization and doing a simple:
synchronized(this) {
openIndex()
write()
closeIndex()
}
i.e. open and close the index between every document write and the
problem still appears.
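The open/write/close sequence above can be sketched without Lucene as follows. This is a minimal sketch and not the attached code: `DocSink`, `DocSinkFactory`, `PerDocumentWriter` and `RecordingSink` are hypothetical names standing in for the real writer. The only point it illustrates is that the writer is closed in a `finally` block while the lock is still held, so no second writer can be opened before the first has released its files.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an index writer; this is not a Lucene type.
interface DocSink extends Closeable {
    void add(String doc) throws IOException;
}

// Hypothetical factory so the sketch does not depend on Lucene.
interface DocSinkFactory {
    DocSink open() throws IOException;
}

final class PerDocumentWriter {
    private final Object lock = new Object();
    private final DocSinkFactory factory;

    PerDocumentWriter(DocSinkFactory factory) {
        this.factory = factory;
    }

    // Open, write one document, and close, all under a single lock.
    void write(String doc) throws IOException {
        synchronized (lock) {
            DocSink sink = factory.open();
            try {
                sink.add(doc);
            } finally {
                sink.close(); // always release the underlying files
            }
        }
    }
}

// In-memory sink used only to exercise the pattern.
final class RecordingSink implements DocSink {
    static int opened = 0;
    static int closed = 0;
    static final List<String> docs = new ArrayList<String>();

    RecordingSink() {
        opened++;
    }

    public void add(String doc) {
        docs.add(doc);
    }

    public void close() {
        closed++;
    }
}
```

With a real writer, `factory.open()` would be the place where the IndexWriter is constructed.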
For a recap, here is the exception:
java.io.IOException: Cannot overwrite: C:\index9121\_1.cfs
    at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:433)
    at org.apache.lucene.index.CompoundFileWriter.close(CompoundFileWriter.java:150)
    at org.apache.lucene.index.DocumentsWriter.createCompoundFile(DocumentsWriter.java:587)
    at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3251)
    at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3110)
    at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1659)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1633)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1601)
I cleared out the logs and ran another reindex, and Lucene produced the
following debug output. Unfortunately, the debug output does not include
any information about the error. I am not sure why this is the case, but
I can assure you that the log was generated by the reindexing operation.
Any ideas on what this might be? Can you suggest a workaround?
As an aside, is Lucene 2.2.0 compatible with Lucene 2.3.0 indexes?
If I can't sort this out in the next couple of days, I may need to switch
everyone back to Lucene 2.2.0 temporarily until this problem is resolved.
IFD [http-8090-2]: setInfoStream
[EMAIL PROTECTED]
IW 22 [http-8090-2]: setInfoStream:
[EMAIL PROTECTED]:\index9121 autoCommit=false
[EMAIL PROTECTED]
[EMAIL PROTECTED]
ramBufferSizeMB=16.0 maxBufferedDocs=-1 maxBuffereDeleteTerms=-1
maxFieldLength=50000 index=_0:C54 _1:C82 _2:C14 _3:c36 _4:c6 _5:c52 _6:c53
IW 22 [Timer-0]: now flush at close
IW 22 [Timer-0]: flush: segment=_7 docStoreSegment=_7 docStoreOffset=0
flushDocs=true flushDeletes=true flushDocStores=true numDocs=28
numBufDelTerms=0
IW 22 [Timer-0]: index before flush _0:C54 _1:C82 _2:C14 _3:c36 _4:c6
_5:c52 _6:c53
IW 22 [Timer-0]: DW: flush postings as segment _7 numDocs=28
IW 22 [Timer-0]: DW: closeDocStore: 2 files to flush to segment _7
numDocs=28
IW 22 [Timer-0]: DW: oldRAMSize=245760 newFlushedSize=38119
docs/MB=770.223 new/old=15.511%
IFD [Timer-0]: now checkpoint "segments_8" [8 segments ; isCommit = false]
IFD [Timer-0]: delete pending file _f.cfs
IFD [Timer-0]: delete "_f.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_f.cfs":
java.io.IOException: Cannot delete C:\index9121\_f.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _h.cfs
IFD [Timer-0]: delete "_h.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_h.cfs":
java.io.IOException: Cannot delete C:\index9121\_h.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _d.cfs
IFD [Timer-0]: delete "_d.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_d.cfs":
java.io.IOException: Cannot delete C:\index9121\_d.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _c.cfs
IFD [Timer-0]: delete "_c.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_c.cfs":
java.io.IOException: Cannot delete C:\index9121\_c.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _b.cfs
IFD [Timer-0]: delete "_b.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_b.cfs":
java.io.IOException: Cannot delete C:\index9121\_b.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _g.cfs
IFD [Timer-0]: delete "_g.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_g.cfs":
java.io.IOException: Cannot delete C:\index9121\_g.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _e.cfs
IFD [Timer-0]: delete "_e.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_e.cfs":
java.io.IOException: Cannot delete C:\index9121\_e.cfs; Will re-try later.
IFD [Timer-0]: now checkpoint "segments_8" [8 segments ; isCommit = false]
IFD [Timer-0]: delete pending file _f.cfs
IFD [Timer-0]: delete "_f.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_f.cfs":
java.io.IOException: Cannot delete C:\index9121\_f.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _h.cfs
IFD [Timer-0]: delete "_h.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_h.cfs":
java.io.IOException: Cannot delete C:\index9121\_h.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _d.cfs
IFD [Timer-0]: delete "_d.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_d.cfs":
java.io.IOException: Cannot delete C:\index9121\_d.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _c.cfs
IFD [Timer-0]: delete "_c.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_c.cfs":
java.io.IOException: Cannot delete C:\index9121\_c.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _b.cfs
IFD [Timer-0]: delete "_b.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_b.cfs":
java.io.IOException: Cannot delete C:\index9121\_b.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _g.cfs
IFD [Timer-0]: delete "_g.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_g.cfs":
java.io.IOException: Cannot delete C:\index9121\_g.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _e.cfs
IFD [Timer-0]: delete "_e.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_e.cfs":
java.io.IOException: Cannot delete C:\index9121\_e.cfs; Will re-try later.
IFD [Timer-0]: delete "_7.fnm"
IFD [Timer-0]: delete "_7.frq"
IFD [Timer-0]: delete "_7.prx"
IFD [Timer-0]: delete "_7.tis"
IFD [Timer-0]: delete "_7.tii"
IFD [Timer-0]: delete "_7.nrm"
IFD [Timer-0]: delete "_7.fdx"
IFD [Timer-0]: delete "_7.fdt"
IW 22 [Timer-0]: LMP: findMerges: 8 segments
IW 22 [Timer-0]: LMP: level -1.0 to 5.202747: 8 segments
IW 22 [Timer-0]: CMS: now merge
IW 22 [Timer-0]: CMS: index: _0:C54 _1:C82 _2:C14 _3:c36 _4:c6 _5:c52
_6:c53 _7:c28
IW 22 [Timer-0]: CMS: no more merges pending; now return
IW 22 [Timer-0]: now call final commit()
IW 22 [Timer-0]: start commit() skipWait=true sizeInBytes=0
IW 22 [Timer-0]: commit index=_0:C54 _1:C82 _2:C14 _3:c36 _4:c6 _5:c52
_6:c53 _7:c28
IW 22 [Timer-0]: now sync _7.cfs
IW 22 [Timer-0]: commit complete
IFD [Timer-0]: now checkpoint "segments_9" [8 segments ; isCommit = true]
IFD [Timer-0]: delete pending file _f.cfs
IFD [Timer-0]: delete "_f.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_f.cfs":
java.io.IOException: Cannot delete C:\index9121\_f.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _h.cfs
IFD [Timer-0]: delete "_h.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_h.cfs":
java.io.IOException: Cannot delete C:\index9121\_h.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _d.cfs
IFD [Timer-0]: delete "_d.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_d.cfs":
java.io.IOException: Cannot delete C:\index9121\_d.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _c.cfs
IFD [Timer-0]: delete "_c.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_c.cfs":
java.io.IOException: Cannot delete C:\index9121\_c.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _b.cfs
IFD [Timer-0]: delete "_b.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_b.cfs":
java.io.IOException: Cannot delete C:\index9121\_b.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _g.cfs
IFD [Timer-0]: delete "_g.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_g.cfs":
java.io.IOException: Cannot delete C:\index9121\_g.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _e.cfs
IFD [Timer-0]: delete "_e.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_e.cfs":
java.io.IOException: Cannot delete C:\index9121\_e.cfs; Will re-try later.
IFD [Timer-0]: deleteCommits: now decRef commit "segments_8"
IFD [Timer-0]: delete "segments_8"
IW 22 [Timer-0]: done all syncs
IW 22 [Timer-0]: at close: _0:C54 _1:C82 _2:C14 _3:c36 _4:c6 _5:c52
_6:c53 _7:c28
IFD [Timer-0]: delete pending file _f.cfs
IFD [Timer-0]: delete "_f.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_f.cfs":
java.io.IOException: Cannot delete C:\index9121\_f.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _h.cfs
IFD [Timer-0]: delete "_h.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_h.cfs":
java.io.IOException: Cannot delete C:\index9121\_h.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _d.cfs
IFD [Timer-0]: delete "_d.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_d.cfs":
java.io.IOException: Cannot delete C:\index9121\_d.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _c.cfs
IFD [Timer-0]: delete "_c.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_c.cfs":
java.io.IOException: Cannot delete C:\index9121\_c.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _b.cfs
IFD [Timer-0]: delete "_b.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_b.cfs":
java.io.IOException: Cannot delete C:\index9121\_b.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _g.cfs
IFD [Timer-0]: delete "_g.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_g.cfs":
java.io.IOException: Cannot delete C:\index9121\_g.cfs; Will re-try later.
IFD [Timer-0]: delete pending file _e.cfs
IFD [Timer-0]: delete "_e.cfs"
IFD [Timer-0]: IndexFileDeleter: unable to remove file "_e.cfs":
java.io.IOException: Cannot delete C:\index9121\_e.cfs; Will re-try later.
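The log above repeats the same "unable to remove file" entries at every checkpoint, which makes it hard to see which files are actually affected. The helper below is a small sketch for reducing such a log to the distinct file names. It assumes only the message format visible in this log; `UndeletableFiles` and `collect` are hypothetical names, not part of Lucene.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UndeletableFiles {
    // Matches lines such as:
    //   IFD [Timer-0]: IndexFileDeleter: unable to remove file "_f.cfs":
    private static final Pattern FAILED =
        Pattern.compile("IndexFileDeleter: unable to remove file \"([^\"]+)\"");

    // Returns the distinct file names, in order of first appearance.
    static Set<String> collect(List<String> logLines) {
        Set<String> names = new LinkedHashSet<String>();
        for (String line : logLines) {
            Matcher m = FAILED.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }
}
```

Applied to this log (for example via `Files.readAllLines` on the index.log file), it would yield the seven names _f.cfs, _h.cfs, _d.cfs, _c.cfs, _b.cfs, _g.cfs and _e.cfs. None of those segments appears in the segment list reported at close (_0 through _7).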
Michael McCandless wrote:
That use case of Lucene should be fine, ie no further synchronization
should be necessary.
Your debug output below is great, but it doesn't seem to cover the
occurrence of that exception. Can you post the full debug output?
Can you describe how you reindex in more detail? Are you manually
removing files from the index and then opening a new IndexWriter with
create=true?
Mike
package com.stimulus.archiva.index;
import java.io.File;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Date;
import java.util.Timer;
import java.util.TimerTask;
import javax.mail.MessagingException;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import com.stimulus.archiva.domain.Config;
import com.stimulus.archiva.domain.Email;
import com.stimulus.archiva.domain.Index;
import com.stimulus.archiva.domain.Indexer;
import com.stimulus.archiva.domain.Volume;
import com.stimulus.archiva.exception.ExtractionException;
import com.stimulus.archiva.exception.MessageSearchException;
import com.stimulus.archiva.language.AnalyzerFactory;
import com.stimulus.archiva.search.ArchivaAnalyzer;
public class VolumeIndex {

    protected static final Logger logger =
        Logger.getLogger(VolumeIndex.class.getName());
    public static final int indexOpenTime = 2000;
    IndexWriter writer = null;
    Volume volume;
    Timer closeIndexTimer = new Timer();
    Object indexLock = new Object();
    ArchivaAnalyzer analyzer = new ArchivaAnalyzer();
    Indexer indexer = null;

    public VolumeIndex(Indexer indexer, Volume volume) {
        this.volume = volume;
        this.indexer = indexer;
        closeIndexTimer.scheduleAtFixedRate(new TimerTask() {
            public void run() {
                closeIndex();
            }
        }, indexOpenTime, indexOpenTime);
    }
    protected void openIndex() throws MessageSearchException {
        Exception lastError = null;
        synchronized(indexLock) {
            if (writer==null) {
                logger.debug("openIndex() index will be opened. it is currently closed.");
            } else {
                logger.debug("openIndex() did not bother opening index. it is already open.");
                return;
            }
            if (volume == null)
                throw new MessageSearchException("assertion failure: null volume",logger);
            logger.debug("opening index for write {"+volume+"}");
            indexer.prepareIndex(volume);
            Index activeIndex = volume.getActiveIndex();
            logger.debug("opening search index for write {indexpath='"+activeIndex.getPath()+"'}");
            int attempt = 0;
            do {
                try {
                    writer = new IndexWriter(FSDirectory.getDirectory(activeIndex.getPath()),false, analyzer);
                    writer.setMaxFieldLength(50000);
                    if (logger.isDebugEnabled()) {
                        File file = new File(Config.getFileSystem().getLogPath()+File.separator+"index.log");
                        PrintStream debugout = new PrintStream(file);
                        writer.setInfoStream(debugout);
                    }
                    break;
                } catch (IOException io) {
                    io.printStackTrace();
                    if (attempt==9) {
                        // most obvious reason for error is that there is a lock on the index, due to a hard shutdown
                        // resolution: delete the lock, and try again
                        logger.warn("failed to open search index for write. possible write lock due to hard system shutdown.",io);
                        logger.info("attempting recovery. deleting index lock file and retrying..");
                        File lockFile = new File(activeIndex.getPath()+File.separatorChar + "write.lock");
                        lockFile.delete();
                    }
                    attempt++;
                    try { Thread.sleep(10); } catch (Exception e) {}
                    lastError = io;
                }
            } while (attempt<10);
            if (attempt>=10)
                throw new MessageSearchException("failed to open/ index writer {location='"+activeIndex.getPath()+"'}",lastError,logger);
        }
    }
    public void indexMessage(Email message) throws MessageSearchException {
        long s = (new Date()).getTime();
        if (message == null)
            throw new MessageSearchException("assertion failure: null message",logger);
        logger.debug("indexing message {"+message+"}");
        Document doc = new Document();
        try {
            DocumentIndex docIndex = new DocumentIndex(indexer);
            docIndex.write(message,doc);
            String language = doc.get("lang");
            if (language==null)
                language = indexer.getIndexLanguage();
            synchronized (indexLock) {
                openIndex();
                writer.addDocument(doc,AnalyzerFactory.getAnalyzer(language,AnalyzerFactory.Operation.INDEX));
            }
            logger.debug("message indexed successfully {"+message+",language='"+language+"'}");
        } catch (MessagingException me) {
            throw new MessageSearchException("failed to decode message during indexing",me,logger, Level.DEBUG);
        } catch (IOException me) {
            throw new MessageSearchException("failed to index message {"+message+"}",me,logger, Level.DEBUG);
        } catch (ExtractionException ee) {
            // we will want to continue indexing
            //throw new MessageSearchException("failed to decode attachments in message {"+message+"}",ee,logger, Level.DEBUG);
        } catch (Exception e) {
            throw new MessageSearchException("failed to index message",e,logger, Level.DEBUG);
        }
        logger.debug("indexing message end {"+message+"}");
        long e = (new Date()).getTime();
        logger.debug("indexing time {time='"+(e-s)+"'}");
    }
    protected void closeIndex() {
        synchronized(indexLock) {
            if (writer==null)
                return;
            int attempt = 0;
            do {
                try {
                    writer.close();
                    logger.debug("writer closed");
                    break;
                } catch (Exception io) {
                    io.printStackTrace();
                    logger.error("failed to close index writer:"+io.getMessage(),io);
                    attempt++;
                    try { Thread.sleep(10); } catch (Exception e) {}
                    continue;
                }
            } while (attempt<10);
            writer = null;
        }
    }
    // deliberately non recursive (so we avoid situations where the whole h/d is deleted)
    public void deleteIndex() throws MessageSearchException {
        logger.debug("delete index {indexpath='"+volume.getIndexPath()+"'}");
        synchronized(indexLock) {
            Config.getConfig().getServices().stop(Indexer.class);
            File indexDir = new File(volume.getIndexPath());
            if (!indexDir.exists()) return;
            if (indexDir.isDirectory()) {
                String[] children = indexDir.list();
                for (int i=0; i<children.length; i++) {
                    String filepath = volume.getIndexPath()+File.separatorChar+children[i];
                    logger.debug("deleting file {path='" + filepath +"'}");
                    File file = new File(filepath);
                    boolean success = file.delete();
                    if (!success) {
                        try {
                            File newFile = File.createTempFile("temp","idx");
                            file.renameTo(newFile);
                        } catch (Exception e) {
                            logger.error("failed to delete file in existing index {filepath='"+filepath+"'}");
                            //throw new MessageSearchException("failed to delete file in existing index {filepath='"+filepath+"'}",logger);
                        }
                    } else
                        logger.debug("deleted file successfully {filepath='" + filepath +"'}");
                }
            }
        }
    }
    protected void finalize() throws Throwable {
        logger.debug("volumeindex class is shutting down");
        try {
            closeIndexTimer.cancel();
            closeIndex();
        } finally {
            super.finalize();
        }
    }
}
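The deleteIndex() method above relies on File.delete(), which only returns false on failure and gives no reason. As a possible diagnostic aid, the sketch below does the same non-recursive cleanup with java.nio.file.Files.delete, which throws an IOException describing why a file could not be removed. It is an illustration only, under the assumption that a Java 7 or later runtime is available; `IndexDirCleaner` and `deleteFiles` are hypothetical names and are not part of the attached source.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class IndexDirCleaner {
    // Deletes the regular files directly inside dir (never descending into
    // subdirectories) and returns one description per file that could not
    // be deleted, including the exception that explains why.
    static List<String> deleteFiles(Path dir) throws IOException {
        List<String> failures = new ArrayList<String>();
        if (!Files.isDirectory(dir)) {
            return failures;
        }
        // Snapshot the listing first so deletion does not race the iterator.
        List<Path> children = new ArrayList<Path>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path child : stream) {
                children.add(child);
            }
        }
        for (Path child : children) {
            if (!Files.isRegularFile(child)) {
                continue; // deliberately non recursive
            }
            try {
                Files.delete(child);
            } catch (IOException e) {
                failures.add(child + ": " + e);
            }
        }
        return failures;
    }
}
```

Logging the returned list would show, per file, the reason the file system gave for refusing the delete.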