Re: how to rebuild a index corrupted?

2017-03-23 Thread Cristian Lorenzetto
Errata corrige/addition to the questions in my previous post. I studied these Lucene classes a bit to understand them: 1) setCommitData is designed for versioning the index, not for passing a transaction log. However, if the user data is different for every transaction id, it is equivalent. 2) NRT

Re: how to rebuild a index corrupted?

2017-03-23 Thread Cristian Lorenzetto
Following that train of thought ... I have added an explanation to avoid misunderstanding. I use the transaction id not to introduce transactions in Lucene (an async commit rules out a traditional transaction system) but to sign segments with an external key (the transaction id), so if for a corruption error in

Re: how to rebuild a index corrupted?

2017-03-23 Thread Michael McCandless
You should be able to use the sequence numbers returned by IndexWriter operations to "know" which operations made it into the commit and which did not, and then on disaster recovery replay only those operations that didn't make it? Mike McCandless http://blog.mikemccandless.com On Thu, Mar 23, 2
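The replay idea above can be sketched as follows. This is a minimal illustration, not code from the thread: `ReplayLog` is a hypothetical application-side log kept outside Lucene, and it assumes that each logged operation carries a unique sequence number returned by the writer and that a commit's sequence number covers every operation at or below it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical application-side operation log; not part of Lucene. */
class ReplayLog {
    // sequence number -> description of the operation (assumed unique per operation)
    private final NavigableMap<Long, String> ops = new TreeMap<>();

    /** Record an operation together with the sequence number the writer returned for it. */
    void record(long seqNo, String operation) {
        ops.put(seqNo, operation);
    }

    /**
     * Operations whose sequence number is strictly greater than the last
     * commit's sequence number are assumed not to be in the commit, so they
     * are the ones to replay after a disaster. Returned in sequence order.
     */
    List<String> toReplay(long lastCommitSeqNo) {
        return new ArrayList<>(ops.tailMap(lastCommitSeqNo, false).values());
    }
}
```

In a real application the `long` values would come from the writer's add, update, delete and commit calls, and the log itself would have to be stored durably somewhere other than the index.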

Re: how to rebuild a index corrupted?

2017-03-23 Thread Michael McCandless
Lucene corruption should be rare and only due to bad hardware; if you are seeing otherwise we really should get to the root cause. Mapping documents to each segment will not be easy in general, especially if that segment is now corrupted so you can't search it. Documents lost because of power los

Re: how to rebuild a index corrupted?

2017-03-23 Thread Cristian Lorenzetto
I deduce the transaction range not from the corrupted segment but from the intact segments. The transaction id is incremental and I imagine segments are saved sequentially, so if segment 5 is missing, reading the intact segment 4 I can find the maximum transaction id A, reading the segment 6
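The arithmetic behind this idea might look like the sketch below. It rests on assumptions the message does not confirm: that transaction ids are consecutive integers, that segments really are written in order, and that the segment after the gap yields the smallest id it contains. `TransactionGap` and `missingRange` are hypothetical names.

```java
/** Hypothetical helper; assumes consecutive integer transaction ids. */
class TransactionGap {
    /**
     * Given the largest id found in the intact segment before the gap and the
     * smallest id found in the intact segment after it, return the inclusive
     * range {first, last} of ids that would have to be re-indexed, or null if
     * no id lies between them.
     */
    static long[] missingRange(long maxIdBeforeGap, long minIdAfterGap) {
        long first = maxIdBeforeGap + 1;
        long last = minIdAfterGap - 1;
        return first <= last ? new long[] {first, last} : null;
    }
}
```

The sketch ignores updates and deletions, which can touch segments other than the missing one.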

Re: how to rebuild a index corrupted?

2017-03-23 Thread Cristian Lorenzetto
Yes, exactly. Working in the past on systems that use Lucene (for example Alfresco projects), I saw that Lucene corruption happens sometimes, and every time the rebuild takes a lot of time ... so I thought of a way to speed up fixing a corrupted index. In addition there is a rare case not de

Re: how to rebuild a index corrupted?

2017-03-23 Thread Michael McCandless
If you use a single thread then, yes, segments are sequential. But if e.g. you are updating documents, then deletions (because a document was replaced) are recorded against different segments, so merely dropping the corrupted segment will mean you don't drop the deletions. Mike McCandless http:/
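A toy model can make this point concrete. The sketch below is not how Lucene stores segments or deletions; `ToyIndex` is a hypothetical class that only mimics the behaviour described: an update marks the old copy deleted in the segment that holds it and writes the new copy into another segment.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of segments with per-segment deletions; not Lucene's data structures. */
class ToyIndex {
    // segment number -> keys of documents written into that segment
    private final Map<Integer, Set<String>> docs = new HashMap<>();
    // segment number -> keys marked as deleted in that segment
    private final Map<Integer, Set<String>> deleted = new HashMap<>();

    void add(int segment, String key) {
        docs.computeIfAbsent(segment, s -> new HashSet<>()).add(key);
    }

    /** Mark older copies deleted in the other segments, then write the new copy. */
    void update(int segment, String key) {
        for (Map.Entry<Integer, Set<String>> e : docs.entrySet()) {
            if (e.getKey() != segment && e.getValue().contains(key)) {
                deleted.computeIfAbsent(e.getKey(), s -> new HashSet<>()).add(key);
            }
        }
        add(segment, key);
    }

    /** Drop a (corrupted) segment; deletions recorded in other segments stay. */
    void dropSegment(int segment) {
        docs.remove(segment);
        deleted.remove(segment);
    }

    /** A key is live if some segment holds a copy that is not marked deleted. */
    boolean isLive(String key) {
        for (Map.Entry<Integer, Set<String>> e : docs.entrySet()) {
            Set<String> del = deleted.getOrDefault(e.getKey(), Collections.emptySet());
            if (e.getValue().contains(key) && !del.contains(key)) {
                return true;
            }
        }
        return false;
    }
}
```

If a document is added to segment 1, updated into segment 2, and segment 2 is then dropped, the document is gone entirely in this model: the old copy in segment 1 is still marked deleted.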

Re: how to rebuild a index corrupted?

2017-03-23 Thread Cristian Lorenzetto
You are right, but maybe it is possible to solve this problem. I can try :) I'm not sure, but in NRT, using a single committer, it is a single batch thread executing the commits, so it might be sequential. I think your case is when 2 segments are not merged and contain changes to the same entities