> > I might dig deeper once I find the time, but perhaps someone already
> > familiar with the code might want to take a look at it before I waste a
> > week on it ;-)
>
> The issue is the change in ld.so/library_subr.c rev 1.34. If you back that
> change out, the crash disappears.
>
> The problem is that no one makes changes to the linkages inside ld.so out
> of boredom: there was some previous program that crashed without that
> change, but the details weren't documented or preserved in a regress/
> program. I've made a couple stabs at reproducing the original program so
> that we can be sure to keep it fixed when fixing this, but haven't been
> able to pin down a case where the committed change solved the problem. If
> you can figure that out, I would gladly buy you a beer or three. Elsewise
> we're reaching the point where we back that change out and wait for someone
> complain... :-(

I managed to come up with a case where a double decrement takes place when
running with the change from 1.34 undone. There are two libraries, l1 and
l2. l1 depends on l2; l2 has no deps. Then you do this dance:

1. dlopen l1
2. dlopen l2
3. dlclose l2
4. dlopen l2
5. dlclose l2
6. dlclose l1

So first (1) we load l1, and this also loads l2 as a dep. Now (2) we
explicitly open l2, but since it's already loaded, l2 makes a group
reference to l1. We (3) close the handle we just opened, and this
decrements the grouprefcount on l1, but l1 is still on l2's list of
grouprefs (they don't get removed prior to 1.34). l2 also won't be unloaded
since it's still needed as a dep by l1. (4) We open l2 again explicitly.
Again l2 makes a groupref to l1, so now l1 appears on l2's group list twice.
So the next close (5) decrements l1's grouprefcount twice, making it
negative... it's around here that l1 gets unloaded. When we try to close it
in (6), it's not there.

Fwiw, I also saw some segfaults which looked very much like the use after
free I had in my SDL2 case. But they seem much more frequent when the change
from 1.34 is there. I don't have the time to investigate that tonight...

I've put together a little test case, see http://guu.fi/ldtest.tgz for the
code.

-Henri
