Marcel Reutegger wrote:
Christoph Kiehl wrote:
I like the idea of having a transactional index, but I don't think it's
a good idea to read this index from a binary property in a database,
because in our case we've got a fairly large repository with index
files of around 40 MB. As far as I understand, you would have to
transfer 40 MB to the database on every index change that gets
committed. Am I right?
In general, this is correct, but Lucene is designed so that it never
modifies an existing index file. If you have a 40 MB index segment
file and you delete a document from that segment, Lucene simply
updates a separate small file kept alongside the index, called
<segment-name>.del. Adding a new document to an existing index segment
is not possible; in that case Lucene creates a new segment.
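To illustrate the point, here is a minimal sketch (hypothetical classes, not Lucene's actual implementation) of the write-once-segment idea: the large segment data is never rewritten, and a delete only flips a bit in a small side structure of the kind Lucene persists as <segment-name>.del.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model of an immutable index segment. The doc list is
// write-once; deletions only touch the small "deleted" bitmap, which
// stands in for the <segment-name>.del side file.
class Segment {
    final String name;
    final List<String> docs = new ArrayList<>(); // frozen after close()
    final BitSet deleted = new BitSet();         // persisted as <name>.del
    private boolean closed = false;

    Segment(String name) { this.name = name; }

    void add(String doc) {
        if (closed) throw new IllegalStateException(
            "segments are write-once; new docs go into a new segment");
        docs.add(doc);
    }

    void close() { closed = true; }

    // Deleting rewrites only the tiny bitmap, never the segment data.
    void delete(int docId) { deleted.set(docId); }

    int liveDocCount() { return docs.size() - deleted.cardinality(); }
}

public class SegmentDemo {
    public static void main(String[] args) {
        Segment seg = new Segment("_0");
        seg.add("doc-a");
        seg.add("doc-b");
        seg.close();

        seg.delete(0); // conceptually: write "_0.del" only
        System.out.println(seg.liveDocCount()); // prints 1
    }
}
```

With a BLOB-backed store, this would mean only the small .del file needs to be re-uploaded after a delete, not the 40 MB segment.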
Ok. To get this working, you have to create at least one segment per
transaction, right? And index merging could be done in the background? Sounds
really interesting. But if the blob values are cached locally, they have to be
downloaded on startup before the index becomes fast. Or does the blob
cache survive restarts? Lots of questions ;)
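The scheme being asked about can be sketched as follows (illustrative names only, not Jackrabbit or Lucene API): each committed transaction seals its pending documents as a new small segment, and a background job later compacts the accumulated segments into one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a transactional index: one new segment per
// commit, plus a merge step that a background thread could run.
public class TxIndexDemo {
    static class Index {
        final List<List<String>> segments = new ArrayList<>();
        private List<String> pending = new ArrayList<>();

        void add(String doc) { pending.add(doc); }

        // Commit seals the transaction's docs as a new immutable segment.
        void commit() {
            if (!pending.isEmpty()) {
                segments.add(pending);
                pending = new ArrayList<>();
            }
        }

        // Background merge: rewrite many small segments as one large one.
        void merge() {
            List<String> merged = new ArrayList<>();
            for (List<String> s : segments) merged.addAll(s);
            segments.clear();
            segments.add(merged);
        }

        int segmentCount() { return segments.size(); }
    }

    public static void main(String[] args) {
        Index idx = new Index();
        idx.add("a"); idx.commit(); // tx 1 -> segment 1
        idx.add("b"); idx.commit(); // tx 2 -> segment 2
        System.out.println(idx.segmentCount()); // prints 2
        idx.merge();                            // background compaction
        System.out.println(idx.segmentCount()); // prints 1
    }
}
```

The trade-off this sketch exposes is exactly the one raised above: commits stay cheap (small segments), but the merge step periodically rewrites large files, which is where BLOB transfer cost would reappear.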
Cheers,
Christoph