Hi,

  

1. We upgraded Lucene from 4.6 to 8+. After upgrading, we are facing an issue with Lucene index creation.

We are indexing in a multi-threaded environment. When we create indexes in bulk, Lucene documents get corrupted: data is not updated correctly, and data from different rows gets merged together.
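For reference, a single IndexWriter instance is thread-safe and is meant to be shared by all indexing threads; cross-row corruption usually points to Document/Field instances being reused across threads, or to a non-unique update term. A minimal sketch of the intended pattern (the index path, the "rowId" field, and the sample data are assumptions for illustration):

```java
import java.nio.file.Paths;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.FSDirectory;

public class SharedWriterExample {
    public static void main(String[] args) throws Exception {
        // One IndexWriter shared by all threads; IndexWriter itself is thread-safe.
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("/tmp/idx")),   // hypothetical path
                new IndexWriterConfig(new StandardAnalyzer()));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 100; i++) {
            final int row = i;
            pool.submit(() -> {
                // Build a fresh Document per task; never share Document/Field
                // instances between threads.
                Document doc = new Document();
                // A unique-key field lets updateDocument replace exactly one row.
                doc.add(new StringField("rowId", Integer.toString(row), Field.Store.YES));
                doc.add(new TextField("body", "row data " + row, Field.Store.NO));
                try {
                    // Atomic delete-then-add keyed on the unique term,
                    // so different rows can never be merged into one document.
                    writer.updateDocument(new Term("rowId", Integer.toString(row)), doc);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        writer.commit();
        writer.close();
    }
}
```

If your code follows this shape and corruption still occurs, the analyzer (AnalyzerFactory$MA in your config) is worth checking for shared mutable state.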

2. When we call the updateDocument method for a single record, the change is not visible to the IndexReader until the count reaches 8. Once the count exceeds 8, the records become visible to the IndexReader (8 segment files are created). Is there any alternative for reducing this segment file creation?
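Visibility to a reader does not actually depend on segment count; a plain IndexReader opened from the directory only sees committed changes. For per-record visibility without committing, Lucene's near-real-time (NRT) SearcherManager opened on top of the IndexWriter is the usual approach. A sketch, assuming a hypothetical index path and "rowId" field:

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.FSDirectory;

public class NrtVisibilityExample {
    public static void main(String[] args) throws Exception {
        IndexWriter writer = new IndexWriter(
                FSDirectory.open(Paths.get("/tmp/idx")),   // hypothetical path
                new IndexWriterConfig(new StandardAnalyzer()));
        // NRT manager reads directly from the writer; no commit is needed
        // for searchers to see updates.
        SearcherManager manager = new SearcherManager(writer, null);

        Document doc = new Document();
        doc.add(new StringField("rowId", "42", Field.Store.YES));
        writer.updateDocument(new Term("rowId", "42"), doc);

        manager.maybeRefresh();                 // make the update searchable now
        IndexSearcher searcher = manager.acquire();
        try {
            int hits = searcher.count(new TermQuery(new Term("rowId", "42")));
            System.out.println("hits=" + hits);
        } finally {
            manager.release(searcher);          // always release acquired searchers
        }
        manager.close();
        writer.close();
    }
}
```

With this pattern, commit() becomes a durability checkpoint only, decoupled from search visibility.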

3. The above two issues are resolved by forceMerge(1), but that is not feasible for our use case because it takes 3X memory (and disk). We are creating indexes for huge volumes of data.
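As an alternative to forceMerge(1), the TieredMergePolicy you already use can be tuned to keep the segment count lower in the background, without the 3X cost of a full merge. A sketch of the relevant knobs (the concrete values here are illustrative assumptions, not recommendations):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.TieredMergePolicy;

public class MergeTuning {
    // Sketch: bias TieredMergePolicy toward fewer, larger segments
    // instead of calling forceMerge(1).
    static IndexWriterConfig tunedConfig() {
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        TieredMergePolicy tmp = new TieredMergePolicy();
        tmp.setSegmentsPerTier(5.0);        // default 10; lower = fewer segments per tier
        tmp.setMaxMergeAtOnce(5);           // merge fewer segments per merge
        tmp.setMaxMergedSegmentMB(5120.0);  // cap on merged segment size (as in your config)
        config.setMergePolicy(tmp);
        return config;
    }
}
```

Note that segment count by itself is rarely a problem; if the motivation for forceMerge was only the visibility issue in point 2, an NRT reader removes that need entirely.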

 

4. IndexWriter Config:
analyzer=com.datanomic.director.casemanagement.indexing.AnalyzerFactory$MA

ramBufferSizeMB=64.0

maxBufferedDocs=-1

mergedSegmentWarmer=null

delPolicy=com.datanomic.director.casemanagement.indexing.engines.TimedDeletionPolicy

commit=null

openMode=CREATE_OR_APPEND

similarity=org.apache.lucene.search.similarities.BM25Similarity

mergeScheduler=ConcurrentMergeScheduler: maxThreadCount=-1, maxMergeCount=-1, 
ioThrottle=true

codec=Lucene80

infoStream=org.apache.lucene.util.InfoStream$NoOutput

mergePolicy=[TieredMergePolicy: maxMergeAtOnce=10, maxMergeAtOnceExplicit=30, 
maxMergedSegmentMB=5120.0, floorSegmentMB=2.0, 
forceMergeDeletesPctAllowed=10.0, segmentsPerTier=10.0, 
maxCFSSegmentSizeMB=8.796093022207999E12, noCFSRatio=0.1, deletesPctAllowed=33.0]

indexerThreadPool=org.apache.lucene.index.DocumentsWriterPerThreadPool@24348e05

readerPooling=true

perThreadHardLimitMB=1945

useCompoundFile=false

commitOnClose=true

indexSort=null

checkPendingFlushOnUpdate=true

softDeletesField=null

readerAttributes={}

writer=org.apache.lucene.index.IndexWriter@23a84a99

 

Please suggest some alternatives to forceMerge, how to deal with IndexWriter.commit in a multi-threaded environment, and how to commit data when updating a single record.
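On commit in a multi-threaded setup: a common pattern is to let indexing threads only add/update documents, and have a single background task own commit() (and the NRT refresh). A sketch under those assumptions; the 30-second interval is arbitrary and should be tuned for your durability needs:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.search.SearcherManager;

public class PeriodicCommitter {
    // Sketch: one background thread owns commit(); indexing threads never call it.
    static ScheduledExecutorService start(IndexWriter writer, SearcherManager manager) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleWithFixedDelay(() -> {
            try {
                writer.commit();        // durability checkpoint
                manager.maybeRefresh(); // visibility for searchers
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, 30, 30, TimeUnit.SECONDS);   // interval is an assumption; tune for your load
        return ses;
    }
}
```

This avoids per-record commits (which are expensive and create many small segments) while keeping single-record updates searchable via the refresh.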

 

 

Thanks,

Jyothsna
