I didn't see any attachments on this email? (I was expecting
the .trc file so I could look at the full infoStream output).
Mike
Marcelo Ochoa wrote:
Hi Michael:
First, thanks a lot for your time.
See comments below.
Is there any way to capture & serialize the actual documents being
added (this way I can "replay" those docs to reproduce it)?
The documents come from a VARCHAR2 column of Oracle's all_source
system view; in fact, the source is a table created as:
create table test_source_big as (select * from all_source);
Are you using threads? Is autoCommit true or false?
By default the Oracle JVM uses a single-threaded model, except that
Lucene starts a parallel thread. The infoStream output shows only one
thread.
AutoCommit is false.
I am creating the IndexWriter with this code:
    IndexWriter writer = null;
    Parameters parameters = dir.getParameters();
    int mergeFactor =
        Integer.parseInt(parameters.getParameter("MergeFactor",
            "" + LogMergePolicy.DEFAULT_MERGE_FACTOR));
    int maxBufferedDocs =
        Integer.parseInt(parameters.getParameter("MaxBufferedDocs",
            "" + IndexWriter.DEFAULT_MAX_BUFFERED_DOCS));
    int maxMergeDocs =
        Integer.parseInt(parameters.getParameter("MaxMergeDocs",
            "" + LogDocMergePolicy.DEFAULT_MAX_MERGE_DOCS));
    int maxBufferedDeleteTerms =
        Integer.parseInt(parameters.getParameter("MaxBufferedDeleteTerms",
            "" + IndexWriter.DEFAULT_MAX_BUFFERED_DELETE_TERMS));
    Analyzer analyzer = getAnalyzer(parameters);
    boolean useCompountFileName =
        "true".equalsIgnoreCase(parameters.getParameter("UseCompoundFile",
            "false"));
    boolean autoTuneMemory =
        "true".equalsIgnoreCase(parameters.getParameter("AutoTuneMemory",
            "true"));
    writer =
        new IndexWriter(dir, autoCommitEnable, analyzer, createEnable);
    if (autoTuneMemory) {
        // Use half of the Java pool size, expressed in MB
        long memLimit =
            ((OracleRuntime.getJavaPoolSize() / 100) * 50) / (1024 * 1024);
        logger.info(".getIndexWriterForDir - Memory limit for "
            + "indexing (Mb): " + memLimit);
        writer.setRAMBufferSizeMB(memLimit);
    } else {
        writer.setMaxBufferedDocs(maxBufferedDocs);
    }
    writer.setMaxMergeDocs(maxMergeDocs);
    writer.setMaxBufferedDeleteTerms(maxBufferedDeleteTerms);
    writer.setMergeFactor(mergeFactor);
    writer.setUseCompoundFile(useCompountFileName);
    if (logger.isLoggable(Level.FINE)) {
        writer.setInfoStream(System.out);
    }
The example passes these relevant parameters:
AutoTuneMemory:true;LogLevel:FINE;Analyzer:org.apache.lucene.analysis.StopAnalyzer;MergeFactor:500
So, because AutoTuneMemory is true, instead of setting
MaxBufferedDocs I am setting RAMBufferSizeMB(53) which is calculated
using Oracle SGA free memory.
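
The memory-limit arithmetic from the code above can be checked in
isolation. This is only a sketch: the class name MemLimitSketch and
the 107 MB pool size are assumptions made for illustration (107 MB is
simply one value for which the integer arithmetic yields 53), not a
value reported by OracleRuntime.getJavaPoolSize().

```java
public class MemLimitSketch {
    /**
     * Half of the given pool size in whole megabytes, using the same
     * integer arithmetic as the IndexWriter setup code.
     */
    public static long memLimitMb(long javaPoolSizeBytes) {
        return ((javaPoolSizeBytes / 100) * 50) / (1024 * 1024);
    }

    public static void main(String[] args) {
        // Hypothetical pool size; the real value comes from the Oracle runtime.
        long assumedPoolBytes = 107L * 1024 * 1024;
        System.out.println(memLimitMb(assumedPoolBytes)); // prints 53
    }
}
```

Because every step is integer division, the result is truncated, so
nearby pool sizes can map to the same buffer size.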
Are you using payloads?
No.
Were there any previous exceptions in this IndexWriter before flushing
this segment? Could you post the full infoStream output?
There is no previous exception. I have attached a .trc file generated
by Oracle 11g; it has the infoStream output plus logging information
from the Oracle-Lucene data cartridge.
<snip>
Could you apply the patch below & re-run? It will likely produce
insane amounts of output, but we only need the last section to see
which term is hitting the bug. If that term consistently hits the bug
then we can focus on how/when it gets indexed...
I'll patch my lucene-2.3.1 source and send the .trc file again.
Also, I am comparing the FSDirectory implementation (2.3.1) with my
OJVMDirectory implementation to see what changed in how the API of
BufferedIndex[Input|Output].java is used; maybe the problem is there.
For example, the latest implementation expects an IOException when
opening an IndexInput for a file that doesn't exist. My code threw a
RuntimeException, which works with Lucene 2.2.x but doesn't work with
2.3.1; this was the first change needed to get the Lucene-Oracle
integration working.
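
To illustrate why the exception type matters, here is a minimal
sketch. It does not use Lucene's classes: SketchDirectory, openInput
and canOpen are hypothetical stand-ins, and the caller is only assumed
to behave the way described above, that is, to treat an IOException
for a missing file as a normal condition.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class MissingFileSketch {
    // Hypothetical stand-in for a table-backed directory; not Lucene's API.
    public static class SketchDirectory {
        private final Map<String, byte[]> files = new HashMap<>();

        public void put(String name, byte[] data) {
            files.put(name, data);
        }

        // A missing file is reported with a checked IOException subclass,
        // so callers that catch IOException can recover from it.
        public byte[] openInput(String name) throws IOException {
            byte[] data = files.get(name);
            if (data == null) {
                throw new FileNotFoundException(name);
            }
            return data;
        }
    }

    // Caller that handles "file not found" by catching IOException.
    // A RuntimeException thrown by openInput would not be caught here
    // and would propagate to the caller instead.
    public static boolean canOpen(SketchDirectory dir, String name) {
        try {
            dir.openInput(name);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        SketchDirectory dir = new SketchDirectory();
        dir.put("segments_1", new byte[] {1, 2, 3});
        System.out.println(canOpen(dir, "segments_1")); // prints true
        System.out.println(canOpen(dir, "segments_2")); // prints false
    }
}
```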
Best regards. Marcelo.
--
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Do you Know DBPrism? Look @ DB Prism's Web Site
http://www.dbprism.com.ar/index.html
More info?
Chapter 17 of the book "Programming the Oracle Database using Java &
Web Services"
http://www.amazon.com/gp/product/1555583296/
Chapter 21 of the book "Professional XML Databases" - Wrox Press
http://www.amazon.com/gp/product/1861003587/
Chapter 8 of the book "Oracle & Open Source" - O'Reilly
http://www.oreilly.com/catalog/oracleopen/
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]