Hi Michael:
First, thanks a lot for your time.
See my comments below.
> Is there any way to capture & serialize the actual documents being
> added (this way I can "replay" those docs to reproduce it)?
The documents come from a VARCHAR2 column of Oracle's ALL_SOURCE system
view; in fact they are indexed from a table created as:
create table test_source_big as (select * from all_source);
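In case it helps to reproduce this outside the Oracle JVM, the rows can be
replayed with plain JDBC; below is a minimal sketch (the connection details,
index path and field names are illustrative, not the data cartridge's actual
code; the analyzer and MergeFactor match the failing run):

    import java.sql.*;
    import org.apache.lucene.analysis.StopAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class ReplaySource {
        public static void main(String[] args) throws Exception {
            Class.forName("oracle.jdbc.OracleDriver");
            Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/XE", "scott", "tiger");
            // Same analyzer and MergeFactor as the failing run
            IndexWriter writer =
                new IndexWriter("/tmp/replay-index", new StopAnalyzer(), true);
            writer.setMergeFactor(500);
            Statement stmt = conn.createStatement();
            ResultSet rs =
                stmt.executeQuery("select name, text from test_source_big");
            while (rs.next()) {
                Document doc = new Document();
                doc.add(new Field("name", rs.getString(1),
                                  Field.Store.YES, Field.Index.UN_TOKENIZED));
                doc.add(new Field("text", rs.getString(2),
                                  Field.Store.NO, Field.Index.TOKENIZED));
                writer.addDocument(doc);
            }
            writer.close();
            rs.close();
            stmt.close();
            conn.close();
        }
    }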
>
> Are you using threads? Is autoCommit true or false?
The Oracle JVM uses a single-thread model by default, except that Lucene
starts one parallel thread of its own; the infoStream output shows only
one thread.
autoCommit is false.
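If it helps to rule out that extra thread, I can also force merges to run
serially in the indexing thread; a minimal sketch of the change I would make
(this assumes the extra thread is the default ConcurrentMergeScheduler's
background merge thread):

    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.SerialMergeScheduler;

    public class SerialMergesSketch {
        // Call right after the IndexWriter is created, before adding documents.
        public static void forceSerialMerges(IndexWriter writer) {
            // Run merges in the calling thread instead of the background merge
            // thread, so the whole run stays single-threaded inside the Oracle JVM.
            writer.setMergeScheduler(new SerialMergeScheduler());
        }
    }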
I am creating the Lucene IndexWriter with this code:
    IndexWriter writer = null;
    Parameters parameters = dir.getParameters();
    // Tuning parameters, with Lucene's defaults as fallbacks
    int mergeFactor =
        Integer.parseInt(parameters.getParameter("MergeFactor",
            "" + LogMergePolicy.DEFAULT_MERGE_FACTOR));
    int maxBufferedDocs =
        Integer.parseInt(parameters.getParameter("MaxBufferedDocs",
            "" + IndexWriter.DEFAULT_MAX_BUFFERED_DOCS));
    int maxMergeDocs =
        Integer.parseInt(parameters.getParameter("MaxMergeDocs",
            "" + LogDocMergePolicy.DEFAULT_MAX_MERGE_DOCS));
    int maxBufferedDeleteTerms =
        Integer.parseInt(parameters.getParameter("MaxBufferedDeleteTerms",
            "" + IndexWriter.DEFAULT_MAX_BUFFERED_DELETE_TERMS));
    Analyzer analyzer = getAnalyzer(parameters);
    boolean useCompoundFileName =
        "true".equalsIgnoreCase(parameters.getParameter("UseCompoundFile",
            "false"));
    boolean autoTuneMemory =
        "true".equalsIgnoreCase(parameters.getParameter("AutoTuneMemory",
            "true"));
    writer = new IndexWriter(dir, autoCommitEnable, analyzer, createEnable);
    if (autoTuneMemory) {
        // Flush by RAM usage: take 50% of the Oracle java pool, in MB
        long memLimit =
            ((OracleRuntime.getJavaPoolSize() / 100) * 50) / (1024 * 1024);
        logger.info(".getIndexWriterForDir - Memory limit for indexing (Mb): "
            + memLimit);
        writer.setRAMBufferSizeMB(memLimit);
    } else {
        // Flush by document count instead
        writer.setMaxBufferedDocs(maxBufferedDocs);
    }
    writer.setMaxMergeDocs(maxMergeDocs);
    writer.setMaxBufferedDeleteTerms(maxBufferedDeleteTerms);
    writer.setMergeFactor(mergeFactor);
    writer.setUseCompoundFile(useCompoundFileName);
    if (logger.isLoggable(Level.FINE))
        writer.setInfoStream(System.out);
The example passes these relevant parameters:
AutoTuneMemory:true;LogLevel:FINE;Analyzer:org.apache.lucene.analysis.StopAnalyzer;MergeFactor:500
So, because AutoTuneMemory is true, instead of setting MaxBufferedDocs I am
setting RAMBufferSizeMB to 53, a value calculated from the free memory in
the Oracle SGA (java pool).
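For reference, the AutoTuneMemory calculation boils down to this small sketch
(the java pool size used here is only an illustrative value; the real code
calls OracleRuntime.getJavaPoolSize()):

    public class AutoTuneMemoryExample {
        public static void main(String[] args) {
            // Take 50% of the Oracle java pool and use it as Lucene's
            // RAM buffer size, expressed in megabytes.
            long javaPoolSize = 107L * 1024 * 1024;  // example pool size (~107 MB)
            long memLimit = ((javaPoolSize / 100) * 50) / (1024 * 1024);
            System.out.println("RAM buffer (MB): " + memLimit);  // prints 53
        }
    }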
>
> Are you using payloads?
No.
>
> Were there any previous exceptions in this IndexWriter before flushing
> this segment? Could you post the full infoStream output?
There is no previous exception. Attached is a .trc file generated by
Oracle 11g; it contains the infoStream output plus logging information
from the Oracle-Lucene data cartridge.
>
<snip>
> Could you apply the patch below & re-run? It will likely produce
> insane amounts of output, but we only need the last section to see
> which term is hitting the bug. If that term consistently hits the bug
> then we can focus on how/when it gets indexed...
I'll patch my lucene-2.3.1 sources and send the .trc file again.
Also, I am comparing the FSDirectory implementation (2.3.1) with my
OJVMDirectory implementation to see how the usage of the
BufferedIndex[Input|Output].java API has changed; maybe the problem is there.
For example, the latest implementation expects an IOException when an
IndexInput is opened for a file that doesn't exist, while my code threw a
RuntimeException, which worked with Lucene 2.2.x but does not work with
2.3.1; this was the first change needed to get the Lucene-Oracle
integration working.
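Roughly, the change I mean looks like this (only a sketch; the real
OJVMDirectory has more code, and openExistingInput() is an illustrative
name, not the actual method):

    import java.io.FileNotFoundException;
    import java.io.IOException;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.IndexInput;

    // Sketch of the fix in OJVMDirectory; only openInput() is shown.
    public abstract class OJVMDirectorySketch extends Directory {
        public IndexInput openInput(String name) throws IOException {
            if (!fileExists(name))
                // Lucene 2.3.x expects an IOException (FileNotFoundException)
                // when the file is missing; the old code threw a
                // RuntimeException, which 2.2.x tolerated but 2.3.1 does not.
                throw new FileNotFoundException(name);
            // Illustrative helper: builds the BufferedIndexInput subclass
            // that reads the file stored inside the database.
            return openExistingInput(name);
        }

        protected abstract IndexInput openExistingInput(String name)
            throws IOException;
    }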
Best regards. Marcelo.
--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Do you Know DBPrism? Look @ DB Prism's Web Site
http://www.dbprism.com.ar/index.html
More info?
Chapter 17 of the book "Programming the Oracle Database using Java &
Web Services"
http://www.amazon.com/gp/product/1555583296/
Chapter 21 of the book "Professional XML Databases" - Wrox Press
http://www.amazon.com/gp/product/1861003587/
Chapter 8 of the book "Oracle & Open Source" - O'Reilly
http://www.oreilly.com/catalog/oracleopen/