I really want to use document numbers as a secondary key in my object storage. If I have understood it all correctly, the main problems are deleted documents and optimization. Are there any other issues?

All my tests tell me optimization does this:

Legend:
action
docNum  doc.toString()

for (int i=0; i<4; i++) indexWriter.addDocument(documentFactory(i));
0       Document<stored/uncompressed,indexed<f:0>>
1       Document<stored/uncompressed,indexed<f:1>>
2       Document<stored/uncompressed,indexed<f:2>>
3       Document<stored/uncompressed,indexed<f:3>>

indexReader.deleteDocument(1);
0       Document<stored/uncompressed,indexed<f:0>>
1       DELETED
2       Document<stored/uncompressed,indexed<f:2>>
3       Document<stored/uncompressed,indexed<f:3>>

indexWriter.addDocument(documentFactory(4));
0       Document<stored/uncompressed,indexed<f:0>>
1       DELETED
2       Document<stored/uncompressed,indexed<f:2>>
3       Document<stored/uncompressed,indexed<f:3>>
4       Document<stored/uncompressed,indexed<f:4>>

indexWriter.optimize();
0       Document<stored/uncompressed,indexed<f:0>>
1       Document<stored/uncompressed,indexed<f:2>>
2       Document<stored/uncompressed,indexed<f:3>>
3       Document<stored/uncompressed,indexed<f:4>>

Given that this holds at all times, would it not be fairly easy to inspect the index before optimizing in order to find out how the document numbers will change?
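A minimal sketch of what I mean, assuming optimization really does nothing more than drop deleted documents and close the gaps in order, as in my tests above. The class and method names are made up for illustration. In a real index the deleted flags could presumably be filled from IndexReader.isDeleted(i) for every i below IndexReader.maxDoc(), but here they are hard-coded so the example has no Lucene dependency.

```java
public class DocNumRemap {

    /** Marker for a document that no longer exists after optimization. */
    static final int DELETED = -1;

    /**
     * Predicts the new document number for each old document number,
     * assuming optimization only compacts away deleted documents and
     * keeps the remaining ones in their original order.
     */
    static int[] remapAfterOptimize(boolean[] deleted) {
        int[] newDocNum = new int[deleted.length];
        int next = 0; // next free document number in the optimized index
        for (int old = 0; old < deleted.length; old++) {
            newDocNum[old] = deleted[old] ? DELETED : next++;
        }
        return newDocNum;
    }

    public static void main(String[] args) {
        // State from the test above: five documents, number 1 deleted.
        boolean[] deleted = {false, true, false, false, false};
        int[] map = remapAfterOptimize(deleted);
        for (int old = 0; old < map.length; old++) {
            String target = map[old] == DELETED ? "DELETED" : String.valueOf(map[old]);
            System.out.println(old + " -> " + target);
        }
    }
}
```

For the test above this gives 0 -> 0, 1 -> DELETED, 2 -> 1, 3 -> 2, 4 -> 3, which matches what I see after optimize(). Only references to documents numbered higher than the first deleted one would need updating.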

It might end up being really expensive to update virtually all references to documents in the object storage, and the current thread on update/replace document on the dev list made me look into the problem.

I don't know too much about the file format and SegmentMerger (as far as I know, this is the class that handles optimization), but what is it that makes it so hard to insert a document at the position of a deleted one? Tracing the code a bit gave me the feeling that it should be possible to make exceptions for deleted documents: something like an alternative merge policy for segments containing documents that are to be assigned specific document numbers. Or am I wrong? If it's not a waste of time, I'd be happy to give it a try.


--
karl


