John - I like this approach as well - seems like it should do the trick. I'll merge these changes into our fork at some point and test them out.
Thanks!

- Blake

On Tue, Sep 29, 2015 at 11:02 AM, K. John Wu <[email protected]> wrote:
> Hi, Blake,
>
> Thanks for the report. My preferred solution would be to avoid
> calling fileManager::unload inside the constructors of
> ibis::fileManager::storage objects, and instead make the callers
> responsible for allocating the necessary memory. I have a preliminary
> implementation checked into the SVN as revision 833. If you are
> interested, you can take a look and see if it addresses the problem
> you are experiencing, and experiment a little more thoroughly.
>
> John
>
>
>
> On 9/28/15 11:37 AM, Blake Caldwell wrote:
> > John -
> >
> > I'm hitting a deadlock on a slightly modified fork of version 1.3.8,
> > but from what I can tell, the latest version of FastBit would still
> > have the same problem. Upon looking into it further, it seems that
> > there are several code paths where a thread deadlocks itself, and
> > subsequently all other threads, when the cache size is approaching
> > `maxBytes`.
> >
> > In the case that I ran into, the `storage` constructor was trying to
> > store a file in memory, but when it realizes that there's not enough
> > room in the cache, it first tries to lock the `fileManager`'s mutex.
> > The problem is that the same thread already locked the mutex in
> > `getFile`.
> >
> > A simple fix is to convert `fileManager`'s mutex to a recursive one,
> > so a lock attempt from within the same thread that already owns the
> > lock would succeed. This does add some overhead - I haven't measured
> > how much. It also runs the risk of breaking other sections of the code
> > that I'm not familiar with, which might want to unlock that mutex if
> > it's locked. A recursive mutex that's locked twice would need to be
> > unlocked twice, in that case.
> >
> > I also considered using the standard mutex type, but passing down a
> > boolean that says whether the current thread has locked it. Once I saw
> > how far that reached, I opted for the easy recursive mutex.
> >
> > I show the code path that deadlocked for me, along with the recursive
> > mutex initialization here:
> >
> > https://gist.github.com/wblakecaldwell/89f29ddb1e98fedda245
> >
> > I'd love to hear your thoughts - Thanks!
> >
> > - Blake Caldwell
> >
> >
> > _______________________________________________
> > FastBit-users mailing list
> > [email protected]
> > https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
> >
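
To make the recursive-mutex idea from the quoted report concrete, here is a minimal sketch in standard C++11. It is not FastBit code: `ToyCache`, its `getFile` and `unload` members, and the byte counters are hypothetical stand-ins, and `std::recursive_mutex` is used in place of whatever mutex type the library actually wraps. The point is only that a thread which already holds the lock can take it again in a nested call instead of blocking on itself.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a cache manager; NOT FastBit's fileManager.
class ToyCache {
public:
    explicit ToyCache(std::size_t maxBytes) : maxBytes_(maxBytes), used_(0) {}

    // Outer entry point: takes the lock, then may need to free space.
    bool getFile(std::size_t bytes) {
        std::lock_guard<std::recursive_mutex> guard(mutex_);
        if (bytes > maxBytes_) return false;  // can never fit
        if (used_ + bytes > maxBytes_) {
            unload(bytes);  // same thread locks the mutex a second time
        }
        used_ += bytes;
        assert(used_ <= maxBytes_);
        return true;
    }

    // Also callable on its own, so it acquires the lock itself.
    void unload(std::size_t needed) {
        std::lock_guard<std::recursive_mutex> guard(mutex_);
        std::size_t freeBytes = maxBytes_ - used_;
        if (needed > freeBytes) {
            // Pretend to evict just enough cached bytes to fit the request.
            used_ -= (needed - freeBytes);
        }
    }

    std::size_t used() const {
        std::lock_guard<std::recursive_mutex> guard(mutex_);
        return used_;
    }

private:
    mutable std::recursive_mutex mutex_;  // assumption: swapped in for a plain mutex
    std::size_t maxBytes_;
    std::size_t used_;
};
```

With a plain `std::mutex` in the same position, the nested `unload` call would try to lock a mutex its own thread already holds. That is undefined behavior for `std::mutex` and in practice typically hangs, which matches the self-deadlock described in the thread. The recursive variant has to be released once per acquisition, which the two `lock_guard` scopes take care of here.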
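
The alternative preferred in the reply quoted above, keeping the storage constructor free of cache management and letting the caller make room first, might look roughly like the following. This is an illustrative sketch only, not the code in SVN revision 833: `ToyManager`, `ToyStorage`, and `makeRoomLocked` are hypothetical names, and the oldest-first eviction policy is an assumption made to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical storage object: its constructor only allocates and never
// touches the manager or its mutex.
class ToyStorage {
public:
    explicit ToyStorage(std::size_t bytes) : data_(bytes) {}
    std::size_t size() const { return data_.size(); }
private:
    std::vector<char> data_;
};

// Hypothetical manager: the caller of the constructor makes room first.
class ToyManager {
public:
    explicit ToyManager(std::size_t maxBytes) : maxBytes_(maxBytes), used_(0) {}

    std::shared_ptr<ToyStorage> getFile(std::size_t bytes) {
        std::lock_guard<std::mutex> guard(mutex_);  // locked exactly once
        if (bytes > maxBytes_) return nullptr;      // can never fit
        makeRoomLocked(bytes);                      // no second lock attempt
        std::shared_ptr<ToyStorage> s = std::make_shared<ToyStorage>(bytes);
        cached_.push_back(s);
        used_ += bytes;
        return s;
    }

    std::size_t used() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return used_;
    }

private:
    // Precondition: mutex_ is already held by the calling thread.
    void makeRoomLocked(std::size_t bytes) {
        while (used_ + bytes > maxBytes_ && !cached_.empty()) {
            used_ -= cached_.front()->size();  // evict the oldest entry
            cached_.pop_front();
        }
    }

    mutable std::mutex mutex_;  // a non-recursive mutex is sufficient here
    std::size_t maxBytes_;
    std::size_t used_;
    std::deque<std::shared_ptr<ToyStorage> > cached_;
};
```

Because the only code that frees space runs in a helper that assumes the lock is already held, no path tries to acquire the mutex twice, so an ordinary mutex suffices. The trade-off is that every call site that constructs storage has to remember to make room beforehand.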
