Do NOT rely on the Lucene document number; it is not stable. As I understand it, the general algorithm is that each doc gets an ID one greater than the current max doc ID at index time. However, when you delete documents and optimize your index, the document IDs change. Simplistically, say you have docs indexed with IDs 1, 2, 3, 4, 5, then remove 2 and reoptimize. You then have IDs 1, 2, 3, 4, where the new 2, 3, 4 were formerly 3, 4, 5 respectively.
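To make the renumbering concrete, here is a toy sketch in plain Java. It is not Lucene code and does not claim to mirror Lucene's internals; the class DocNumberDemo and its methods are made up for illustration, and it counts from 0 rather than 1. Each document carries a key chosen by the application, purely so we can see which document ends up at which number.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene, just a list whose positions play the role of doc numbers.
final class DocNumberDemo {
    // Each slot holds an application-assigned key, or null once that document is deleted.
    private final List<String> slots = new ArrayList<>();

    // Append a document; its number is simply its position at index time.
    int add(String key) {
        slots.add(key);
        return slots.size() - 1;
    }

    // Mark a document as deleted; nothing is renumbered yet.
    void delete(int docNumber) {
        slots.set(docNumber, null);
    }

    // Stand-in for optimize: drop deleted slots, so every later document shifts down.
    void optimize() {
        slots.removeIf(key -> key == null);
    }

    // Current number of the document with the given key, or -1 if it is gone.
    int numberOf(String key) {
        return slots.indexOf(key);
    }

    public static void main(String[] args) {
        DocNumberDemo index = new DocNumberDemo();
        for (String key : new String[] {"a", "b", "c", "d", "e"}) {
            index.add(key);
        }
        int before = index.numberOf("c");
        index.delete(index.numberOf("b"));
        index.optimize();
        int after = index.numberOf("c");
        // The key "c" still identifies the same document; its number does not.
        System.out.println("c: " + before + " -> " + after);
    }
}
```

In this toy model, document "c" moves from number 2 to number 1 after "b" is deleted and the list is compacted, while its key stays the same throughout.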
WARNING: I have no idea whether that's exactly how it works. The point is that the doc IDs change, and I wouldn't count on matching whatever algorithm Lucene uses. But you don't need to anyway. Just assign your own document ID that *you* can guarantee doesn't change (no relation to the Lucene ID), store it wherever you want, then search on it.

I believe you'll find that you can search fast enough on such an ID that you won't notice the time. At any rate, that's how I'd start out, and I'd only get fancier if performance proves unacceptable.

Best
Erick

On 11/5/06, mukkamalla rama kumar <[EMAIL PROTECTED]> wrote:
Hi, How is this document number assigned to documents? Can I give my own document number? I would like to get the document number for a particular file that I added to an index.