Re: [Neo] Lucene Index Corruption?

Adam Rabung Wed, 09 Dec 2009 07:37:50 -0800

Hi,
Lots of questions :)

1. This iterator will just prevent duplicates from being returned from
the iterator?  If there's a condition (bug in my code) that causes
shutdown w/ open transactions, will the Lucene indexes continue to
double until they're huge?


2. Would it be possible to detect this situation, and rebuild the
indexes?  I guess this is a losing cause if the app is regularly
corrupting the data.

3. Could you allow me to close transactions from different threads?
Yesterday, I wrote something that tracks tx opens and closes, and
could iterate through all open transactions and call finish() on them.
But TransactionImpl.finish seems to assume the calling thread is the
creating thread, which is not the case here.

4. Better yet, expose API for me to force-finish all open
transactions?  I'd rather have a botched transaction than a corrupt
index.

5. Is the only trigger for this open transactions plus a Lucene
shutdown (via shutdown() or abrupt process termination)?  In further
testing, it seems I can't reproduce the problem with a clean or dirty
shutdown if all transactions are closed.

6. I assume your iterator fix will make b11?  What are the chances the
root cause will be fixed in b11?  Do you have a tentative release date
for b11?
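
To make question 1 concrete, here is a minimal sketch of the kind of lazy
duplicate filter I am picturing. This is only my assumption of the approach,
not the code that was actually committed; DedupIterator and its fields are
hypothetical names, and it uses nothing but java.util.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

// Hypothetical sketch, not the actual Neo fix: wraps any iterator and
// skips items that have already been returned once.
class DedupIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Set<T> seen = new HashSet<T>();
    private T nextItem;
    private boolean hasNextItem;

    DedupIterator(Iterator<T> source) {
        this.source = source;
    }

    // Lazily advance the source until an item not seen before is found.
    public boolean hasNext() {
        while (!hasNextItem && source.hasNext()) {
            T candidate = source.next();
            if (seen.add(candidate)) { // add() returns false for duplicates
                nextItem = candidate;
                hasNextItem = true;
            }
        }
        return hasNextItem;
    }

    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        hasNextItem = false;
        T result = nextItem;
        nextItem = null;
        return result;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }
}
```

A filter shaped like this only hides duplicates at read time. It keeps a set
of everything returned so far and never touches the underlying data, which is
why I am asking whether the index itself can keep growing.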

Thanks,
Adam


On Wed, Dec 9, 2009 at 9:02 AM, Mattias Persson
<matt...@neotechnology.com> wrote:
> Hi Adam,
>
> We're aware of such problems and I just now committed a fix which
> basically is a cover-up until those bugs are fixed... the iterable
> from getNodes() now runs through a filter (lazily before each next())
> so your problem should go away.
>
> 2009/12/8 Adam Rabung <adamrab...@gmail.com>:
>> Hi,
>> I've recently run into problems with indexes becoming corrupt after
>> unclean shutdowns. Basically:
>> 1. Transaction 1 writes some data
>> 2. Transaction 2 reads some data, and is left open
>> 3. The database is shut down, with warnings about an open transaction
>> 4. The database is opened.  Recovery executes, but it appears the
>> Lucene indexes are "doubled" - that is, where we used to have key =>
>> (value1), we now have key => (value1, value1).
>>
>> I've attached a JUnit test case that hopefully reproduces this for
>> you.  I'm on Java 5, Mac OS 10.5, neo-1.0-b10.jar, and
>> index-util-0.8.jar
>>
>> Obviously, the first step on my end is to make sure any open
>> transactions are closed before attempting a shutdown.  However, I'm
>> able to pretty reliably reproduce this problem in a much scarier way -
>> just killing a running Neo process via the Eclipse "Console" view "red
>> square" process stop button.  Amazingly, Eclipse doesn't shut down
>> processes properly when this button is used, so I can't count on
>> shutdown hooks:
>> https://bugs.eclipse.org/bugs/show_bug.cgi?id=38016
>>
>> What expectations should I have for corruption when a database +
>> indexes are .shutDown() with open transactions?
>> What expectations should I have for corruption when a database +
>> indexes are terminated abruptly (Eclipse Console, power outage, etc)?
>> Beyond proper transaction management, and ensuring shutDown() is
>> called, is there anything I should be doing to help protect this data?
> I don't know if there's anything you could do. The problem is that we
> can't at the moment make lucene participate (I mean _really_
> participate) in a 2 phase commit together with the NeoService, but we
> will fix these issues in the near future.
>
> Until then, I think you'll be fine with this new fix.
>>
>> Thanks,
>> Adam
>>
>> _______________________________________________
>> Neo mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>>
>>
>
>
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Neo Technology, www.neotechnology.com
