Hi,

so I think the issue is now fixed (crossing fingers). I write this mail in order to give some info about what has been done.

First, I would like to thank Kiran who seconded me during the long process of trying to get the bug fixed. I would like to thank Sumit Goyal and Murali Krishna who found the bug and submitted a test to exhibit the problem. It was incredibly useful.

It took me a month - even if I didn't worked only on this problem, as I also had to deal with some client at the same time -, but I can tell that it was the most complicated problem I have ever had to work on. I'm not 100% sure what exactly was the problem, I'm afraid that I just put it aside when I removed the AvlTree implentation to replace it with some simpler data structure.

When the issue was first created, I thought it was a problem with MINA, as we had a concurrency issue with version 2.0.0-M4. Then I bumped up the version to 2.0.0-M5, and closed the initial issue. Sadly, after some more tests, another issue showed up, with a different error message, and the JIRA was reopened.

A test was provided, which was the most useful thing possible, as it helped me to reproduce the error on my computer. We started some more debugging session, and the problem occurred quite immediately.

But first, we have seen another problem : an index was growing without bound (SubLevel). We debugged the server with Kiran, and found the reason. At least, it was easy to fix, even if it wasn't related to the initial issue. Lesson learned : when something is wrong, it's very likely that while investigating the problem, you find some other problems around :). The bad news is that many parts of the server need more thorough tests :/

While reviewing the index handling, as the problem seemed to be in this area (some few stackTrace helped to show that, at least : sometime old school debugging with printf come to the rescue... ), we also cleaned many small things, like useless deletion of non existing elements in index.

Then we started to think about the way the server stores data on disk. Basically, you have N threads updating many index and the master table, and all those data read and write is done in a thread-safe part of the server (Jdbm backend). As every access to JDBM is synchronized, we discarded the idea that JDBM was buggy at this point. It narrowed the issue to what is in between the interceptor chain and JDBM.

Then we saw that the place we massage index was not thread safe. In other words, as we have to update around 10 index when updating an entry, it must be done in a way that guarantee that the index can't be fooled by another thread. This was not the case. We then thought we found the reason of the concurrent problem. We fixed that by adding a hell of synchronized all over this part of the code.

Great success ! When the multithreaded test was launched, instead of failing after a few hundreds of updates, it was now running well for thousands of updates (I went up to 700 000 updates before getting a breakage).

That was the good news. The bad news was that the problem was still around, but deeper. At least, we know for sure that we removed a real concurrent issue in the code. So what could be the problem ?

When we are storing data in indexes, we store one key associated to many values (for instance, an ObjectClass can be related to many entries). In order to manage this, we use a ordered data structure to store those values. Namely, depending on the number of values, either an AvlTree or a BTree. This AvlTree has to be serialized and deserialized in order to be stored on disk. This is where I thought we could have an issue : the de/serialization was done outside of the synchronised part, which is not a problem per se, but we send and get back a byte[] from JDBM. What is fishy here is that JDBM caches some data, and returns a *reference* to the data . Getting a reference on a byte[] means that some other thread can perfectly modify this byte[] at the same time, with a potential impact on the expected result. It sounded like a good catch, so I modified JDBM to get it return a copy of the byte[]. It didn't changed anything, I still get the error after a few hundred thousands updates.

Next step was to start suspecting the AvlTree implementation. I decided that rewriting this part was a sane idea. What a fool ... AvlTree is a complex data structure, and writing an iterative implementation needs time, and a hell of tests. Basically, as I was stuck with no other ideas, I followed this idea, even if it was crazy. (some little voice in my head was telling me I'm not any more a student, and this was a lost of time, but the other side of my personality told me that I was smart enough to do better than what we have ...). The bad thing about Avl trees is that documentation about it is *very* rough on the internet. There are some C implementations, but they use pointers, something we don't have in Java, making it quite complex to implement. We also have some specific needs, like we want to be able to move backward and forward in the tree, adding more constraints.

After around 2 weeks and a half, having a partially working AVLTree impelmentation, I just had a better idea. In the mean time, Kiran has implemented an in-jdbm serialization, removing the previous potential problem with byte[] copies.

So the new idea was that we don't need AvlTree, we just need a sorted array, which was less costly when searching data into it (O(N) for a sorted array, compared to a 1.44O(N) for an AvlTree). Sure, insertion and removal was more costly, as it needs an array copy, but as we don't do a lot of addition or removal, that was a good trade. More important, it save a lot of memory, and serialization was way easier (memory shring by 4, for references to an entry, as we have less pointers (left and right child, previous and next leaf, and the balance)). Less memory consumption means more cache, less garbage collection, less pointer update, faster read and write on disk.

I ended by writing this ArrayTree implementation, which took me three days, with all the necessary tests, and a bit of problems with the current Cursor interface (there are some semantic I really don't like in this area). Then I just had to substitute AvlTree with ArrayTree, and voilĂ .

Tests went up to 2 000 000 updates, without any problem. Twice. So I closed the issue.

My understanding is that the previous AvlTree had a problem in the way it managed next/previous index. I didn't spent time in analysing the code, it was too complex. Last, not least, I added a lot of debug logs in the server, and tried to see what the logs gave. That led me to grep/sed a few gigabytes of logs, with little success. It just sucked time, with no result except to eliminate other options.

Last, not least, what made me thing that the bug was a problem in AvlTree in some very specific cases, is that we always had a NPE in the very same line, and that the analyze of the byte[] taken from the JDBM backend shown that there were some missing elements in the serialized tree. As the tree was correctly serialized (I added some check in the serialize() method : the byte[] was deserialized before being stored in JDBM to be sure that the serialization was ok, and I had no problem in it), that mean something was broken while rebuilding the AvlTree.

It now seems to work, but I'm not 100% sure the problem is really fixed. I think I just pushed some dust in the trash bin, expecting that it was not pushing them under the rug...

In any case, this part of the server needs a deep check, as I saw many potential problems that needs to be addressed. We probably have to run this multi-threaded test on a more powerful computer, to see if it is really fixed.


--
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org


Reply via email to