Thank you David for looking into this issue. I am rather not an expert on this area but looking through the existing code I would have expected that a fix should rather go into the NSConcretePointerFunctions.m or NSConcretePointerFunctions..h file. There we seem to set up the matching functions when working with the different pointer semantics. I understand that this mechanism is a lot more clumsy than what is possible with C++ templates and the compiler support that is offered there for weak pointers.
But I may be completely off here. Fred > Am 24.11.2019 um 13:27 schrieb David Chisnall <[email protected]>: > > Hi, > > I have spent some time looking at this issue. It appears to be an issue in > how the GSIMap macros are used in NSConcrete{Hash,Map}Table. This is > horribly complex code to try to fix, because: > > - It is generic code that is using C macros rather than type-safe C++ > templates, so has a load of casts and is very difficult to follow where any > of the parts actually work. > > - It is trying to implement ARC semantics, but the raw operations are retain > and release (which doesn’t really match to the behaviour of weak references), > rather than read / assign / delete. At a high level, ARC defines operations > in terms of ownership by variables, manual RR defines operations in terms of > operations on objects. > > One specific problem in this instance is that GSIMap was that GSIMap was > passing node->key to the HASH macro, but in ARC mode a weak object pointer > can’t just be loaded, you need to use the correct read barrier. > > There was a comment from 2013 from RFM saying that the old code caused double > retains, but unfortunately his new code was also incorrect and did not use > the weak read / write barriers at all in the weak reference case, completely > breaking NSMapTable with weak keys or values. > > I have now fixed it at least enough that the test passes and submitted a PR: > > https://github.com/gnustep/libs-base/pull/84 > > If I were writing this code from scratch, I’d make a templated class that > took C++ smart pointers as arguments and then have an Objective-C class > cluster, one for each of the valid combinations of NSPointerFunctionOptions. > This would make it impossible for bugs like this to creep in, because any > access would implicitly go via the smart pointer wrappers (if the compiler > supports ARC, then the ObjC ones would be fairly trivial). WinObjC uses a > similar approach, though with run-time specialisation. > > Note that, even when correct, the current code is quite dependent on the ARC > optimiser for performance. It will repeatedly call objc_loadWeak(), which > returns an autoreleased object. The ARC optimiser can hopefully transform > some of these into objc_loadWeakRetained() calls followed by objc_release() > calls and avoid filling up the autorelease pool, but I’m not convinced that > it will for this code. If someone were willing to rewrite GSIMap to expose > an interface cleanly in terms of operations on variables that hold references > rather than on objects, then it would be possible for the hash function, for > example, to be defined as objc_loadWeakRetained(), -hash, objc_release() > directly, which would avoid touching the autorelease pool. > > For reference, a std::unordered_map with strong or weak object pointers Just > Works™ with clang. If we were using C++ and ARC in GNUstep, the NSMapTable > and NSHashTable code would be a fraction of their current size and a lot more > correct. There are various reasons that I rarely contribute to GNUstep > anymore, but the fact that the community actively prevents me from using > tools that make my life easier was a significant one. > > As a high-level observation, not using ARC within GNUstep itself means that > we ship bugs. This code has been broken since at least 2013, possibly > earlier. If GNUstep developers are not using modern Objective-C features > then anyone who tries to use them will find that things are broken. > > David > > P.S. I spent about an hour of this debugging process fighting the tests > suite. Please can someone who understands the test suite make it provide > simple instructions for building a single test? Or even doing a -j32 build > of all of the tests, rather than waiting for each one to run before moving on > to the next one. It takes me about 10 seconds to compile -base and then a > few minutes to wait for gmake check -j32 to build the test that I care about. > > >> On 11 Nov 2019, at 22:16, Fred Kiefer <[email protected]> wrote: >> >> Hi David, >> >> you are aware that Frederik is talking about the travis CI system on Github >> where his tests are failing? Either the whole build system is set up >> incorrectly or we have a general issue for week pointers. >> We probably should not expect these tests to work with gcc, maybe there is a >> way to disable them for this setup? >> >> Fred >> >>> Am 11.11.2019 um 18:36 schrieb David Chisnall <[email protected]>: >>> >>> >>> Where do the tests fail currently? These look sufficiently simple that >>> they should just be calling a few of the ARC functions. The implementation >>> of these classes is in NSConcrete{PointerFunctions,HashTable}.m and their >>> interaction with the runtime is defined here: >>> >>> https://github.com/gnustep/libs-base/blob/3d77109fb634f11e5d51dd9b13002102ab419729/Source/NSConcretePointerFunctions.h#L33 >>> >>> From what you've said, I wonder if you're compiling GNUstep-base without >>> access to the libobjc2 headers? If so, then it will provide the >>> old-runtime versions of these functions, where weak is a synonym for >>> unsafe_unretained (these classes on macOS did this back in the 10.7 or so >>> days). Can you put a #error in NSConcretePointerFunctions.h on line 33 and >>> check that this breaks the build for you? If the build still works, then >>> you're building GNUstep in a configuration that does not provide weak >>> pointer semantics. >>> >>> It might be better, now, to have these methods error at run time if this >>> support is not available, rather than give an unsafe version. >>> >>> If you are compiling the correct version, the most likely cause is that >>> we're using objc_storeWeak() on a location in memory that does not contain >>> either 0 or a valid object pointer (e.g. if NSHashTable is getting unzeroed >>> memory). Please can you put a breakpoint on objc_storeWeak and see what >>> happens? >>> >>> David >>> >>> On 11/11/2019 15:00, Frederik Seiffert wrote: >>>> Hi all, >>>> I’ve opened a pull request with some very simple tests for the >>>> NSHashTable/NSMapTable weak object support: >>>> https://github.com/gnustep/libs-base/pull/80 >>>> Unfortunately they are failing on the CI, so it seems that the issues >>>> described below are not specific to our setup. I also confirmed the tests >>>> pass when run against Apple’s Foundation. >>>> Does anyone have knowledge of the weak pointer support in GNUstep and >>>> could take a look at this? >>>> Thanks! >>>> Frederik >>>>> Am 07.10.2019 um 16:51 schrieb Frederik Seiffert <[email protected] >>>>> <mailto:[email protected]>>: >>>>> >>>>> Hi David, >>>>> >>>>>> While replacing the hash table, I managed to turn it into a >>>>>> reproduceable fault on at least one machine. The bug is quite subtle: >>>>>> >>>>>> We have a map from objects to weak reference structures. >>>>>> The weak reference structure points to the object. >>>>>> When we delete the last weak reference, we delete the object from the >>>>>> map. >>>>>> We delete the object from the map using the object as the key. >>>>>> But by the time a weak reference is deallocated, its object pointer has >>>>>> been zeroed... >>>>>> ...so we always remove a random element from the map and leave a >>>>>> dangling pointer in the table. >>>>> >>>>> Should this already be fixed on the latest libobjc2 master? >>>>> >>>>> Going back to my original issue about NSHashTable with weak objects, I’m >>>>> still seeing it crash with the latest libobjc2 master and also using the >>>>> arc-cxx branch. >>>>> >>>>> It reproduces quite easily with the following code (compiled with ARC), >>>>> which crashes either directly in addObject: on the first invocation (when >>>>> using weakObjectsHashTable), or in -allObjects on the second or third >>>>> invocation (when using hashTableWithWeakObjects). >>>>> >>>>> static __strong NSHashTable *_hashTable = nil; >>>>> static __strong NSObject *_test = nil; >>>>> - (void)testHashTable >>>>> { >>>>> @autoreleasepool { >>>>> if (!_hashTable) { >>>>> _hashTable = [NSHashTable weakObjectsHashTable];// or >>>>> hashTableWithWeakObjects >>>>> _test = [NSObjectnew]; >>>>> [_hashTable addObject:_test];// crash 1 >>>>> }else { >>>>> _test =nil; >>>>> } >>>>> >>>>> NSLog(@"HashTable %@ (_test: %@)", _hashTable.allObjects, _test);// >>>>> crash 2 >>>>> } >>>>> } >>>>> >>>>> Similarly, NSMapTable crashes as well in the second invocation of the >>>>> following function, although I’m not sure if this is the same root cause >>>>> as the hash table crash: >>>>> >>>>> static __strong NSMapTable *_mapTable = nil; >>>>> static __strong NSObject *_test = nil; >>>>> - (void)testMapTable >>>>> { >>>>> @autoreleasepool { >>>>> if (!_mapTable) { >>>>> _mapTable = [NSMapTable strongToWeakObjectsMapTable]; >>>>> _test = [NSObjectnew]; >>>>> [_mapTable setObject:_test forKey:@"test"]; >>>>> }else { >>>>> NSLog(@"obj pre: %@", [_mapTable objectForKey:@"test"]);// >>>>> crash!!! >>>>> _test =nil; >>>>> NSLog(@"obj post: %@", [_mapTable objectForKey:@"test"]); >>>>> } >>>>> >>>>> NSLog(@"MapTable %@ (_test: %@)", >>>>> _mapTable.dictionaryRepresentation, _test); >>>>> } >>>>> } >>>>> >>>>> I’m pasting the stack traces below. I’d appreciate if anyone can shed >>>>> some light on this. >>>>> >>>>> Thanks! >>>>> Frederik >>>>> >>>>> >>>>> >>>>> *NSHashTable crash using weakObjectsHashTable:* >>>>> >>>>> * frame #0: 0xeca8c1d0 libart.so`art_sigsegv_fault >>>>> frame #1: 0xeca8c774 libart.so`art::FaultManager::HandleFault(int, >>>>> siginfo*, void*) + 372 >>>>> frame #2: 0xeca8c49b libart.so`art::art_fault_handler(int, siginfo*, >>>>> void*) (.llvm.650222801) + 43 >>>>> frame #3: 0x625bd6af >>>>> app_process32`___lldb_unnamed_symbol22$$app_process32 + 623 >>>>> frame #4: 0xefee7c50 libc.so`___lldb_unnamed_symbol2$$libc.so + 1 >>>>> frame #5: 0xd1e56762 libobjc.so`objc_msgSend at >>>>> objc_msgSend.x86-32.S:120 >>>>> frame #6: 0xd14f69cd libgnustep-base.so`hashObject(item=0xcf40c100, >>>>> size=0x00000000) at NSConcretePointerFunctions.m:126 >>>>> frame #7: 0xd1400a72 >>>>> libgnustep-base.so`pointerFunctionsHash(PF=0xd0ea7900, item=0xcf40c100) >>>>> at NSConcretePointerFunctions.h:180 >>>>> frame #8: 0xd13fd9b6 >>>>> libgnustep-base.so`GSIMapBucketForKey(map=0xd0ea78d4, key=GSIMapKey @ >>>>> 0xff99f2a4) at GSIMap.h:410 >>>>> frame #9: 0xd1400ce3 >>>>> libgnustep-base.so`GSIMapAddNodeToMap(map=0xd0ea78d4, node=0xcf40c0f0) at >>>>> GSIMap.h:453 >>>>> frame #10: 0xd13fcb64 libgnustep-base.so`GSIMapAddKey(map=0xd0ea78d4, >>>>> key=GSIMapKey @ 0xff99f334) at GSIMap.h:1118 >>>>> frame #11: 0xd13fe82f libgnustep-base.so`-[NSConcreteHashTable >>>>> addObject:](self=0xd0ea78d4, _cmd="C", anObject=0xd0efd14c) at >>>>> NSConcreteHashTable.m:848 >>>>> frame #12: 0xd1e2e7ff libnative-lib.so`::-[ObjectiveCTest >>>>> testHashTable](self=0xcf40c0e4, _cmd="V\x1c") at ObjectiveCTest.mm:78 >>>>> >>>>> *NSHashTable crash using hashTableWithWeakObjects:* >>>>> >>>>> * frame #0: 0xeca8c1d0 libart.so`art_sigsegv_fault >>>>> frame #1: 0xeca8c774 libart.so`art::FaultManager::HandleFault(int, >>>>> siginfo*, void*) + 372 >>>>> frame #2: 0xeca8c49b libart.so`art::art_fault_handler(int, siginfo*, >>>>> void*) (.llvm.650222801) + 43 >>>>> frame #3: 0x625bd6af >>>>> app_process32`___lldb_unnamed_symbol22$$app_process32 + 623 >>>>> frame #4: 0xefee7c50 libc.so`___lldb_unnamed_symbol2$$libc.so + 1 >>>>> frame #5: 0xd1ea975d libobjc.so`objc_msgSend at >>>>> objc_msgSend.x86-32.S:120 >>>>> frame #6: 0xd13170bd libgnustep-base.so`-[GSInlineArray >>>>> initWithObjects:count:](self=0xcf4fd184, _cmd="\xffffff81", >>>>> objects=0xff99f320, count=1) at GSArray.m:420 >>>>> frame #7: 0xd131a41d libgnustep-base.so`-[GSPlaceholderArray >>>>> initWithObjects:count:](self=0xcf4fd184, _cmd="\xffffff81", >>>>> objects=0xff99f320, count=1) at GSArray.m:1199 >>>>> frame #8: 0xd13c2b80 libgnustep-base.so`-[NSConcreteHashTable >>>>> allObjects](self=0xd0da9a14, _cmd="m\x02") at NSConcreteHashTable.m:875 >>>>> frame #9: 0xd1df7844 libnative-lib.so`::-[ObjectiveCTest >>>>> testHashTable](self=0xcf391444, _cmd="V\x1c") at ObjectiveCTest.mm:84 >>>>> >>>>> *NSMapTable crash:* >>>>> >>>>> * frame #0: 0xeca8c1d1 libart.so`art_sigsegv_fault + 1 >>>>> frame #1: 0xeca8c774 libart.so`art::FaultManager::HandleFault(int, >>>>> siginfo*, void*) + 372 >>>>> frame #2: 0xeca8c49b libart.so`art::art_fault_handler(int, siginfo*, >>>>> void*) (.llvm.650222801) + 43 >>>>> frame #3: 0x625bd6af >>>>> app_process32`___lldb_unnamed_symbol22$$app_process32 + 623 >>>>> frame #4: 0xefee7c50 libc.so`___lldb_unnamed_symbol2$$libc.so + 1 >>>>> frame #5: 0xd1e29c6d >>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(id) [inlined] >>>>> isPersistentObject(obj=0xd0bfb27c) at arc.mm:291 >>>>> frame #6: 0xd1e29c66 >>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(id) [inlined] >>>>> retain(obj=0xd0bfb27c) at arc.mm:296 >>>>> frame #7: 0xd1e29c66 >>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(id) [inlined] >>>>> objc_retain(obj=0xd0bfb27c) at arc.mm:587 >>>>> frame #8: 0xd1e29c62 >>>>> libobjc.so`::objc_retainAutoreleasedReturnValue(obj=0xd0bfb27c) at >>>>> arc.mm:581 >>>>> frame #9: 0xd1175a04 libnative-lib.so`::-[ObjectiveCTest >>>>> testMapTable](self=0xcf66c064, _cmd="X\x1c") at ObjectiveCTest.mm:97 >> >> > >
