Hi Michael:
  First, thanks a lot for your time. See comments below.

> Is there any way to capture & serialize the actual documents being
> added (this way I can "replay" those docs to reproduce it)?

  The documents are a VARCHAR2 column taken from Oracle's ALL_SOURCE system
view; in fact it is a table created as:

    create table test_source_big as (select * from all_source);
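  If it helps to replay exactly those rows, a small dump program could look
like the sketch below (this is not part of the cartridge code; the JDBC URL,
credentials and output directory are made-up values, and I am assuming the
TEXT column carries the document body):

    import java.io.File;
    import java.io.FileWriter;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class DumpDocs {
      public static void main(String[] args) throws Exception {
        Class.forName("oracle.jdbc.OracleDriver");
        Connection conn = DriverManager.getConnection(
            "jdbc:oracle:thin:@localhost:1521:XE", "scott", "tiger");
        Statement stmt = conn.createStatement();
        ResultSet rs = stmt.executeQuery("select rownum, text from test_source_big");
        new File("/tmp/docs").mkdirs();
        while (rs.next()) {
          // one flat file per row, so the exact documents can be re-indexed later
          FileWriter out = new FileWriter("/tmp/docs/" + rs.getLong(1) + ".txt");
          String text = rs.getString(2);
          out.write(text == null ? "" : text);
          out.close();
        }
        rs.close();
        stmt.close();
        conn.close();
      }
    }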
> Are you using threads?  Is autoCommit true or false?

  The Oracle JVM uses a single-thread model by default, except that Lucene
starts a parallel thread; the infoStream output shows only one thread.
AutoCommit is false. I am creating the IndexWriter with this code:

    IndexWriter writer = null;
    Parameters parameters = dir.getParameters();
    int mergeFactor = Integer.parseInt(parameters.getParameter("MergeFactor",
        "" + LogMergePolicy.DEFAULT_MERGE_FACTOR));
    int maxBufferedDocs = Integer.parseInt(parameters.getParameter("MaxBufferedDocs",
        "" + IndexWriter.DEFAULT_MAX_BUFFERED_DOCS));
    int maxMergeDocs = Integer.parseInt(parameters.getParameter("MaxMergeDocs",
        "" + LogDocMergePolicy.DEFAULT_MAX_MERGE_DOCS));
    int maxBufferedDeleteTerms = Integer.parseInt(parameters.getParameter("MaxBufferedDeleteTerms",
        "" + IndexWriter.DEFAULT_MAX_BUFFERED_DELETE_TERMS));
    Analyzer analyzer = getAnalyzer(parameters);
    boolean useCompountFileName =
        "true".equalsIgnoreCase(parameters.getParameter("UseCompoundFile", "false"));
    boolean autoTuneMemory =
        "true".equalsIgnoreCase(parameters.getParameter("AutoTuneMemory", "true"));
    // autoCommitEnable (false here) and createEnable are set elsewhere in the class
    writer = new IndexWriter(dir, autoCommitEnable, analyzer, createEnable);
    if (autoTuneMemory) {
      // use 50% of the free Java pool (SGA) as the RAM buffer, converted to MB
      long memLimit = ((OracleRuntime.getJavaPoolSize() / 100) * 50) / (1024 * 1024);
      logger.info(".getIndexWriterForDir - Memory limit for indexing (Mb): " + memLimit);
      writer.setRAMBufferSizeMB(memLimit);
    } else {
      writer.setMaxBufferedDocs(maxBufferedDocs);
    }
    writer.setMaxMergeDocs(maxMergeDocs);
    writer.setMaxBufferedDeleteTerms(maxBufferedDeleteTerms);
    writer.setMergeFactor(mergeFactor);
    writer.setUseCompoundFile(useCompountFileName);
    if (logger.isLoggable(Level.FINE))
      writer.setInfoStream(System.out);

  The example passes these relevant parameters:

    AutoTuneMemory:true;LogLevel:FINE;Analyzer:org.apache.lucene.analysis.StopAnalyzer;MergeFactor:500

So, because AutoTuneMemory is true, instead of setting MaxBufferedDocs I am
setting RAMBufferSizeMB(53), a value calculated from Oracle's free SGA memory.

> Are you using payloads?

  No.

> Were there any previous exceptions in this IndexWriter before flushing
> this segment?  Could you post the full infoStream output?

  There is no previous exception. Attached is a .trc file generated by Oracle
11g; it has the infoStream output plus the logging information from the
Oracle-Lucene data cartridge.

> <snip>
> Could you apply the patch below & re-run?  It will likely produce
> insane amounts of output, but we only need the last section to see
> which term is hitting the bug.  If that term consistently hits the bug
> then we can focus on how/when it gets indexed...

  I'll patch my lucene-2.3.1 sources and send the .trc file again.
  Also, I am comparing the FSDirectory implementation (2.3.1) with my
OJVMDirectory implementation to see how the use of the
BufferedIndex[Input|Output].java API has changed; maybe the problem is there.
For example, the latest implementation expects an IOException when an
IndexInput is opened and the file doesn't exist, while my code throws a
RuntimeException. That works with Lucene 2.2.x but not with 2.3.1, and it was
the first change needed to get the Lucene-Oracle integration working.
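  In code, the change I mean is roughly the sketch below (not the actual
OJVMDirectory source; OJVMIndexInput is a hypothetical name for my IndexInput
implementation, and the other Directory methods are omitted):

    // inside a custom org.apache.lucene.store.Directory subclass
    public IndexInput openInput(String name) throws IOException {
      if (!fileExists(name)) {
        // Lucene 2.3.x, like FSDirectory, signals a missing file with a checked
        // IOException (FileNotFoundException) so callers can catch it and retry
        // or fall back; an unchecked RuntimeException escapes that handling.
        throw new java.io.FileNotFoundException(name);
      }
      return new OJVMIndexInput(name); // hypothetical BLOB-backed IndexInput
    }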
  Best regards, Marcelo.
--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Do you Know DBPrism? Look @ DB Prism's Web Site
http://www.dbprism.com.ar/index.html
More info?
Chapter 17 of the book "Programming the Oracle Database using Java & Web Services"
http://www.amazon.com/gp/product/1555583296/
Chapter 21 of the book "Professional XML Databases" - Wrox Press
http://www.amazon.com/gp/product/1861003587/
Chapter 8 of the book "Oracle & Open Source" - O'Reilly
http://www.oreilly.com/catalog/oracleopen/