[ https://issues.apache.org/jira/browse/LUCENE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769946#action_12769946 ]
Uwe Schindler edited comment on LUCENE-1960 at 10/26/09 9:05 AM:
-----------------------------------------------------------------

I do not know if this is a bug in 2.9.0, but it seems that segments in which all documents are deleted are not removed automatically:

{noformat}
4 of 14: name=_dlo docCount=5
compound=true
hasProx=true
numFiles=2
size (MB)=0.059
diagnostics = {java.version=1.5.0_21, lucene.version=2.9.0 817268P - 2009-09-21 10:25:09, os=SunOS, os.arch=amd64, java.vendor=Sun Microsystems Inc., os.version=5.10, source=flush}
has deletions [delFileName=_dlo_1.del]
test: open reader.........OK [5 deleted docs]
test: fields..............OK [136 fields]
test: field norms.........OK [136 fields]
test: terms, freq, prox...OK [1698 terms; 4236 terms/docs pairs; 0 tokens]
test: stored fields.......OK [0 total field count; avg ? fields per doc]
test: term vectors........OK [0 total vector count; avg ? term/freq vector fields per doc]
{noformat}

Shouldn't such segments be removed automatically during the next *commit*/close of IndexWriter? But that would be a separate issue.

In my opinion, we are fine with the current approach: the longer optimization time is explained by the larger index size now that there is no compression, and the heavier initial merge without addRawDocument is only 30% slower (one time!).

+1 for committing

was (Author: thetaphi):
I do not know if this is a bug in 2.9.0, but it seems that segments in which all documents are deleted are not removed automatically:

{code}
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex - 4 of 14: name=_dlo docCount=5
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex - compound=true
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex - hasProx=true
2009-10-24 17:08:15,264 INFO org.apache.lucene.index.CheckIndex - numFiles=2
2009-10-24 17:08:15,265 INFO org.apache.lucene.index.CheckIndex - size (MB)=0.059
2009-10-24 17:08:15,265 INFO org.apache.lucene.index.CheckIndex - diagnostics = {java.version=1.5.0_21, lucene.version=2.9.0 817268P - 2009-09-21 10:25:09, os=SunOS, os.arch=amd64, java.vendor=Sun Microsystems Inc., os.version=5.10, source=flush}
2009-10-24 17:08:15,265 INFO org.apache.lucene.index.CheckIndex - has deletions [delFileName=_dlo_1.del]
2009-10-24 17:08:15,356 INFO org.apache.lucene.index.CheckIndex - test: open reader.........OK [5 deleted docs]
2009-10-24 17:08:15,356 INFO org.apache.lucene.index.CheckIndex - test: fields..............OK [136 fields]
2009-10-24 17:08:15,357 INFO org.apache.lucene.index.CheckIndex - test: field norms.........OK [136 fields]
2009-10-24 17:08:15,372 INFO org.apache.lucene.index.CheckIndex - test: terms, freq, prox...OK [1698 terms; 4236 terms/docs pairs; 0 tokens]
2009-10-24 17:08:15,373 INFO org.apache.lucene.index.CheckIndex - test: stored fields.......OK [0 total field count; avg ? fields per doc]
2009-10-24 17:08:15,373 INFO org.apache.lucene.index.CheckIndex - test: term vectors........OK [0 total vector count; avg ? term/freq vector fields per doc]
{code}

Shouldn't such segments be removed automatically during the next merge? But that would be a separate issue.

In my opinion, we are fine with the current approach: the longer optimization time is explained by the larger index size now that there is no compression, and the heavier initial merge without addRawDocument is only 30% slower (one time!).

+1 for committing

> Remove deprecated Field.Store.COMPRESS
> --------------------------------------
>
> Key: LUCENE-1960
> URL: https://issues.apache.org/jira/browse/LUCENE-1960
> Project: Lucene - Java
> Issue Type: Task
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: 3.0
>
> Attachments: lucene-1960-1.patch, lucene-1960-1.patch, lucene-1960.patch, optimize-time.txt
>
>
> Also remove FieldForMerge and related code.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org