On 2022/10/25 04:50, Jacek Caban wrote:

Those 4 points describe problems that you solve in the new threading model, but there is no reason they can't be fixed for existing threading models. In fact, ideally they would be fixed for all threading models. Except now we need to worry about one more threading model, meaning that future bugs will be even harder to fix.


Since below is no longer related to GCC itself and can be narrowed down to mingw-w64, I am stopping CC'ing others.


All four of these issues boil down to the fact that TLS callbacks / DLL entrypoints are invoked at the wrong moment: after exit callbacks have been invoked, and after all the other threads have been killed. There are certainly projects, such as LLVM, that just call `_exit()` and don't care about destructors, but they eventually get deadlocks instead.

That being said, if standard conformance is desired, it is necessary to re-implement `atexit()`, `exit()`, `_exit()`, `quick_exit()`, `at_quick_exit()`, etc.


Then there is another fun fact: We already know that, in C++, destructors of static objects are executed in the reverse order of completion of their construction. Thread-local objects complicate this. Suppose we have a static object `SA`, whose constructor constructs another static object `SB`, followed by a thread-local object `TL`, and then returns normally. The order of completion of initialization is then `SB` -> `TL` -> `SA`; however, according to the C++ standard ([basic.start.term]/2, ISO/IEC 14882), the destruction of thread-local objects shall happen before the destruction of static objects, so the order of destruction is actually `TL` -> `SA` -> `SB`.
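To make the scenario concrete, here is a minimal sketch of it. The names `Tracer`, `A`, `touch_sa()` and `construction_log()` are made up for illustration and are not part of mcfgthread or mingw-w64. The log records the order in which construction completes; the destructors only print, so the output at process exit shows which order the runtime actually uses.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Log of names, appended as each constructor finishes its work.
std::vector<std::string>& construction_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical helper: logs on construction, prints on destruction.
struct Tracer {
    const char* name;
    explicit Tracer(const char* n) : name(n) { construction_log().push_back(n); }
    ~Tracer() { std::printf("destroy %s\n", name); }
};

// Plays the role of `SA`: its constructor creates `SB`, then `TL`.
struct A {
    A() {
        static Tracer sb("SB");        // static storage duration
        thread_local Tracer tl("TL");  // thread storage duration
        // Both nested objects are complete before SA itself completes.
        assert(construction_log().size() == 2);
        construction_log().push_back("SA");
    }
    ~A() { std::printf("destroy SA\n"); }
};

// The first call constructs SA (and therefore SB and TL) on the calling thread.
void touch_sa() {
    static A sa;
}
```

If `touch_sa()` is called once from the main thread and `main()` then returns, a conforming runtime should print the destructions as `TL`, `SA`, `SB`, even though construction completed as `SB`, `TL`, `SA`.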

So, for `__cxa_thread_atexit()` or emutls, it's wrong to register cleanup of thread-local data with `atexit()`; it will lead to a wrong order of destruction.
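The difference can be modelled without touching any real runtime. The sketch below is only a model (the `Entry` type and both functions are hypothetical, and it considers the main thread only): it compares a single LIFO list, which is what registering everything with `atexit()` amounts to, against the two-phase order that the standard asks for.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

enum class Storage { Static, ThreadLocal };

// One object whose construction has completed (hypothetical model type).
struct Entry {
    std::string name;
    Storage storage;
};

// Everything in one LIFO list: plain reverse order of completion.
std::vector<std::string> single_queue_order(const std::vector<Entry>& completed) {
    std::vector<std::string> out;
    for (auto it = completed.rbegin(); it != completed.rend(); ++it)
        out.push_back(it->name);
    return out;
}

// Thread-local objects first, then static objects; each phase is LIFO.
std::vector<std::string> standard_order(const std::vector<Entry>& completed) {
    std::vector<std::string> out;
    for (Storage phase : {Storage::ThreadLocal, Storage::Static})
        for (auto it = completed.rbegin(); it != completed.rend(); ++it)
            if (it->storage == phase)
                out.push_back(it->name);
    assert(out.size() == completed.size());  // every object is destroyed exactly once
    return out;
}
```

Feeding in the completion order `SB`, `TL`, `SA` gives `SA`, `TL`, `SB` for the single list, which destroys `SA` while `TL` is still alive, and `TL`, `SA`, `SB` for the two-phase model.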

This illustrates why a new thread model is a must-have, and why mcfgthread provides `__cxa_atexit()`, although it may seem strange, duplicate, or redundant. Such order of destruction is deeply bundled into the `__cxa_finalize()` implementation, and if the Microsoft implementation doesn't work the standard way, it is completely impossible to work around that.



This also may be supported in existing threading models. Overflow is trivial to fix by waiting in a loop. (There are other reasons why OS support for absolute timeout is slightly better, but the price of this design decision makes it questionable. I plan to elaborate more on that on mingw ML, but I need to find time to do a bit of research first).
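For reference, the "waiting in a loop" approach would look roughly like the sketch below. It is written against a caller-supplied `wait_once` callback rather than the real Win32 call so that it stays portable; `wait_with_long_timeout` and `kMaxChunkMs` are names invented here, and the sketch assumes that a timed-out wait has consumed its whole chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>

// Largest finite value for a 32-bit millisecond timeout; 0xFFFFFFFF is
// INFINITE in Win32, so it must not be passed as a chunk.
constexpr std::uint32_t kMaxChunkMs = 0xFFFFFFFEu;

// wait_once(ms) stands in for a 32-bit timed wait: it returns true if the
// object was signalled and false if the wait timed out.
bool wait_with_long_timeout(std::uint64_t total_ms,
                            const std::function<bool(std::uint32_t)>& wait_once) {
    assert(wait_once);  // a callable must be supplied
    std::uint64_t remaining = total_ms;
    for (;;) {
        // Clamp the remaining time to what a 32-bit timeout can express.
        const std::uint32_t chunk = static_cast<std::uint32_t>(
            std::min<std::uint64_t>(remaining, kMaxChunkMs));
        if (wait_once(chunk))
            return true;   // signalled
        remaining -= chunk;
        if (remaining == 0)
            return false;  // the whole timeout has elapsed
    }
}
```

A real version would pass something like `WaitForSingleObject(handle, ms) == WAIT_OBJECT_0` as `wait_once`, and would probably re-read a clock between chunks instead of trusting that each chunk elapsed in full.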


Well, to use your own words, that would also 'make things more complicated than they are'. `ZwWaitForSingleObject()` is a public driver API that is mature, well documented [1], and guaranteed. So why not use that? Why stick to `WaitForSingleObject()`, which is nothing but a wrapper for the aforementioned syscall? Why do we have to maintain something that might continue to work on 9x or CE, but that, in reality, nobody cares about?


[1] https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-zwwaitforsingleobject


--
Best regards,
LIU Hao


_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
