John - I'm hitting a deadlock on a slightly modified fork of version 1.3.8, but from what I can tell, the latest version of FastBit would still have the same problem. Looking into it further, it seems there are several code paths where a thread deadlocks itself, and subsequently all other threads, when the cache size is approaching `maxBytes`.
In the case that I ran into, the `storage` constructor was trying to store a file in memory. When it finds that there's not enough room in the cache, it first tries to lock the `fileManager`'s mutex. The problem is that the same thread already locked that mutex in `getFile`, so it ends up waiting on itself.

A simple fix is to convert `fileManager`'s mutex to a recursive one, so that a lock attempt from the thread that already owns the lock succeeds. This does add some overhead - I haven't measured how much. It also runs the risk of breaking other sections of the code that I'm not familiar with, which might unlock that mutex if they find it locked: a recursive mutex that has been locked twice has to be unlocked twice before another thread can acquire it.

I also considered keeping the standard mutex type and passing down a boolean that says whether the current thread already holds the lock. Once I saw how far that change would reach, I opted for the easy recursive mutex.

I show the code path that deadlocked for me, along with the recursive mutex initialization, here: https://gist.github.com/wblakecaldwell/89f29ddb1e98fedda245

I'd love to hear your thoughts - Thanks!

- Blake Caldwell
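P.S. For anyone who wants to see the self-locking pattern in isolation, here is a minimal sketch. It is not FastBit code: `initMutex` and `relockFromSameThread` are hypothetical names made up for illustration, and I'm assuming plain POSIX mutexes. To avoid actually hanging, the non-recursive case uses `PTHREAD_MUTEX_ERRORCHECK`, which reports the self-deadlock as `EDEADLK` instead of blocking.

```cpp
#include <pthread.h>
#include <cerrno>

// Hypothetical helper (not part of FastBit): initialize a mutex of the
// requested POSIX type, e.g. PTHREAD_MUTEX_ERRORCHECK or
// PTHREAD_MUTEX_RECURSIVE. Returns 0 on success, an errno value otherwise.
static int initMutex(pthread_mutex_t* m, int type) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_mutexattr_settype(&attr, type);
    if (rc == 0) rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

// Mimics the reported code path on a single thread: an outer caller
// (standing in for getFile) takes the lock, then an inner callee
// (standing in for the storage constructor freeing cache space)
// tries to take the same lock again.
// Returns the result of the inner lock attempt, or -1 if setup failed.
static int relockFromSameThread(int type) {
    pthread_mutex_t m;
    if (initMutex(&m, type) != 0) return -1;

    pthread_mutex_lock(&m);              // outer lock
    int inner = pthread_mutex_lock(&m);  // inner lock, same thread

    // A recursive mutex must be unlocked once per successful lock.
    if (inner == 0) pthread_mutex_unlock(&m);
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return inner;
}
```

Calling `relockFromSameThread(PTHREAD_MUTEX_ERRORCHECK)` should give `EDEADLK`, while `relockFromSameThread(PTHREAD_MUTEX_RECURSIVE)` should give 0. With a normal or default mutex, the inner lock would not return an error code; the thread would typically just hang, which matches what I saw.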
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
