The Lucene config module in 2.4 clusters an org.apache.lucene.store.RAMDirectory.
I don't have a huge amount of knowledge about the internals of Lucene (not yet, anyway), so I don't really have any cool insights about what you suggest. But maybe someone on the dev list might?

--Orion

On Jun 22, 2007, at 11:26 AM, Kunal Bhasin wrote:

> Hey Orion,
>
> Lucene came up in the Italy training and we identified a use case which
> I think makes a lot of sense and can be a pain point for many
> Lucene users.
>
> A question first:
>
> Do we cluster (in the work we have done so far in our Lucene config
> module) just the RAMIndex, or both RAMIndex and DiskIndex?
>
> If we cluster both, what is the strategy for clustering DiskIndex?
>
> The pain point identified was that when the index size grows
> exponentially (happens a lot, it seems ;)), people like to keep
> their indexes on disk. Now, the problem with distributing is that
> the natural file-based locking does not guarantee that the index
> won't get corrupted (as two threads could have updated the same
> stream and the last one wins). I think it would be great if Terracotta
> could provide distributed locking and thread coordination in this
> case (acquire the same lock on each index) with minimal contention to
> guarantee that indexes don't get corrupted.
>
> I know that they could always rebuild the index from disk in
> memory, but for very large data, that takes a lot of time.
>
> Also, Terracotta itself provides eviction to disk, so RAMIndex+TC
> should be good enough, but I understand (and I might be wrong) that
> the way Lucene is designed, if someone is already using the
> DiskIndex, it is a lot of rework (almost a complete redesign) to
> move to RAMIndex.
>
> Any thoughts?
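
To make the coordination idea in the quoted message concrete, here is a minimal single-JVM sketch of "every writer acquires the same lock before touching the index". GuardedIndex and runDemo are hypothetical names made up for illustration; they are not Lucene or Terracotta classes. The sketch only shows the locking discipline among threads in one process. Whether Terracotta can extend such a lock across JVMs for a disk-based index is exactly the open question in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an index with a single shared write lock.
// Not a Lucene class; the "index" here is just an in-memory list.
class GuardedIndex {
    private final ReentrantLock writeLock = new ReentrantLock();
    private final List<String> documents = new ArrayList<>();

    // Every writer must take the same lock, so updates are serialized
    // instead of racing ("last one wins").
    void addDocument(String doc) {
        writeLock.lock();
        try {
            documents.add(doc);
        } finally {
            writeLock.unlock();
        }
    }

    int size() {
        writeLock.lock();
        try {
            return documents.size();
        } finally {
            writeLock.unlock();
        }
    }

    // Starts several writer threads against one index and returns the
    // number of documents stored once all of them have finished.
    static int runDemo(int writers, int docsPerWriter) {
        GuardedIndex index = new GuardedIndex();
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < writers; w++) {
            final int id = w;
            Thread t = new Thread(() -> {
                for (int d = 0; d < docsPerWriter; d++) {
                    index.addDocument("writer" + id + "-doc" + d);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for writers", e);
            }
        }
        return index.size();
    }
}
```

With the lock in place, no update is lost: four writers adding 1000 documents each leave 4000 documents in the index. Without it, concurrent calls to ArrayList.add could drop or corrupt entries, which is the in-process analogue of the corruption described above.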
