David <[EMAIL PROTECTED]> wrote on 14/01/2007 20:08:05:

> thanks, How does Lucene give each document an ID when the document is added?
> Is the document ID unchanged until the document is deleted?
Not exactly. When the first doc is added, it is assigned id 0. The next one is assigned id 1, etc. When a doc is deleted, it is at first only marked as deleted.

So if there are 10 docs, they have ids 0 to 9. Now docs 2 and 4 are deleted: there is no change in ids. The next doc added is assigned id 10.

Now if/when the segment containing the deleted docs is merged, all info on those docs is really removed, and docids are renumbered to remove any holes in the numbering. The result is 9 docs with ids 0 to 8. Now the next doc added gets id 9.

Btw, segments are merged either as a result of an explicit call to optimize(), or implicitly following addDocument() or indexWriter.close() (depending on Lucene's merge policy).

Docids are therefore internal, with unstable values.

See also the FAQ - http://wiki.apache.org/jakarta-lucene/LuceneFAQ
Especially "When is it possible for document IDs to change?"
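To make the numbering above concrete, here is a small toy model of it in Python. It is only a sketch of the behaviour described in this mail, not Lucene code: the class ToySegment and its methods are made up for illustration, and it treats the whole index as a single segment.

```python
class ToySegment:
    """Toy model of docid assignment (hypothetical, not Lucene's implementation)."""

    def __init__(self):
        self.docs = []        # position in this list plays the role of the docid
        self.deleted = set()  # docids marked as deleted but not yet removed

    def add(self, doc):
        # A new doc always gets the next free id, even if earlier ids are deleted.
        self.docs.append(doc)
        return len(self.docs) - 1

    def delete(self, docid):
        # Deletion only marks the doc; ids of other docs do not change.
        self.deleted.add(docid)

    def merge(self):
        # Merging really drops the deleted docs, which closes the holes
        # and shifts the ids of every doc that came after a deleted one.
        self.docs = [d for i, d in enumerate(self.docs) if i not in self.deleted]
        self.deleted.clear()

    def live_ids(self):
        return [i for i in range(len(self.docs)) if i not in self.deleted]


index = ToySegment()
for n in range(10):
    index.add(f"doc-{n}")                # ids 0 to 9

index.delete(2)
index.delete(4)
id_before_merge = index.add("doc-10")    # holes are not reused before a merge

index.merge()
ids_after_merge = index.live_ids()       # numbering is contiguous again
id_after_merge = index.add("doc-11")     # continues after the compacted range
```

Note that after the merge the doc originally added as "doc-3" sits at id 2, which is why a docid should not be kept around as a long-lived external key.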