Re: [jira] Commented: (LUCENE-1476) BitVector implement DocIdSet

robert engels Thu, 08 Jan 2009 14:57:42 -0800

The way we've simplified this is that every document has an OID. It simplifies updates and delete tracking (in the transaction log).


On Jan 8, 2009, at 2:28 PM, Marvin Humphrey (JIRA) wrote:

[ https://issues.apache.org/jira/browse/LUCENE-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662107#action_12662107 ]
Marvin Humphrey commented on LUCENE-1476:
-----------------------------------------

Mike McCandless:
Commit is for crash recovery, and for knowing when it's OK to delete
prior commits. Simply writing the files (and not syncing them), and
perhaps giving IndexReader.open the SegmentInfos to use directly (and
not writing a segments_N via the filesystem) would allow us to search
added docs without paying the cost of sync'ing all the files.
Mmm. I think I might have given IndexWriter.commit() slightly different
semantics. Specifically, I might have given it a boolean "sync" argument
which defaults to false.
Also: brand new, tiny segments should be written into a RAMDirectory
and then merged over time into the real Directory.
Two comments. First, if you don't sync, but rather leave it up to the OS when
it wants to actually perform the actual disk i/o, how expensive is flushing? Can
we make it cheap enough to meet Jason's absolute change rate requirements?
Second, the multi-index model is very tricky when dealing with "updates". How
do you guarantee that you always see the "current" version of a given
document, and only that version? When do you expose new deletes in the
RAMDirectory, when do you expose new deletes in the FSDirectory, how do you
manage slow merges from the RAMDirectory to the FSDirectory, how do you manage
new adds to the RAMDirectory that take place during slow merges...
Building a single-index, two-writer model that could handle fast updates while
performing background merging was one of the main drivers behind the tombstone
design.
BitVector implement DocIdSet
----------------------------

                Key: LUCENE-1476
                URL: https://issues.apache.org/jira/browse/LUCENE-1476
            Project: Lucene - Java
         Issue Type: Improvement
         Components: Index
   Affects Versions: 2.4
           Reporter: Jason Rutherglen
           Priority: Trivial
        Attachments: LUCENE-1476.patch

  Original Estimate: 12h
 Remaining Estimate: 12h
BitVector can implement DocIdSet. This is for making SegmentReader.deletedDocs pluggable.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


