Re: [Python-Dev] Bugs in thread_nt.h
On Thursday 10 March 2011, Sturla Molden wrote: As for InterlockedCompareExchange et al., MSDN says this: The parameters for this function must be aligned on a 32-bit boundary; bit != byte Uli ** Domino Laser GmbH, Fangdieckstraße 75a, 22547 Hamburg, Deutschland Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932 ** Visit our website at http://www.dominolaser.com/ ** Diese E-Mail einschließlich sämtlicher Anhänge ist nur für den Adressaten bestimmt und kann vertrauliche Informationen enthalten. Bitte benachrichtigen Sie den Absender umgehend, falls Sie nicht der beabsichtigte Empfänger sein sollten. Die E-Mail ist in diesem Fall zu löschen und darf weder gelesen, weitergeleitet, veröffentlicht oder anderweitig benutzt werden. E-Mails können durch Dritte gelesen werden und Viren sowie nichtautorisierte Änderungen enthalten. Domino Laser GmbH ist für diese Folgen nicht verantwortlich. ** ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
On Wed, Mar 9, 2011 at 8:42 PM, Martin v. Löwis mar...@v.loewis.de wrote: As for the volatile marker - I believe the code is also correct without it, since the owned field is only accessed through initialization and Interlocked operations. Furthermore, if the code weren't correct, volatile would only be half the solution since it only restricts compiler-induced reorders. Some sort of CPU fence instruction would almost certainly be needed as well. Sturla wrote: releasing the time-slice by Sleep(0) for each iteration, it will certainly fail without a volatile qualifier. In general, your compiler will reload globally-accessible memory after a function call (or after a memory store). In fact, it can be quite a job to give them enough hints that they stop doing so! This behavior (which volatile aggravates) unfortunately makes it even tougher to find race conditions. In my experience, volatile should be avoided. I'd even bet money that some grumpy person has written a volatile considered harmful essay. I Am Not A memory-model expert so consult your local guru, p ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
On 3/10/2011 3:07 AM, Paul Du Bois wrote: volatile considered harmful http://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt -- Scott Dial sc...@scottdial.com scod...@cs.indiana.edu ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
This behavior (which volatile aggravates) unfortunately makes it even tougher to find race conditions. In my experience, volatile should be avoided. I'd even bet money that some grumpy person has written a volatile considered harmful essay. I guess all this advice doesn't really apply to this case, though. The Microsoft API declares the parameter as a volatile*, indicating that they consider it proper usage of the API to declare the storage volatile. So ISTM that we should comply regardless of whether volatile is considered morally wrong in the general case. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
Den 10.03.2011 11:06, skrev Scott Dial: http://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt The important part here (forgive me for being a pedant) is that register allocation of the (1) 'owned' field is actually unwanted, and (2) Microsoft specify 'volatile' in calls to the Interlocked* API-functions. I am sorry for misreading 32 bits (4 bytes) as 32 bytes. That is obviously very different. If Microsoft's malloc is sufficient, why does MSDN tell us to use _aligned_malloc instead of malloc? The rest is a comment to the link and a bit OT: Their argument is that (1) volatile supresses optimization; (2) within a critical section, access to shared data is synchronized; and thus (3) volatile is critical there. That is true, to some extent. Volatile is not a memory barrier, nor a mutex. Volatile's main purpose is to prevent the compiler from storing a variable in a register. Volatile might be used incorrectly if we don't understand this. Obvious usecases for volatile are: - Implementation of a spinlock, where register allocation is detrimental. - A buffer that is filled from the outside with some DMA mechanism. - Real-time programs and games where order of execution and and timing is critical, so optimization must be supressed. Even though volatile is not needed for processing within a critical section, we still need the shared data to be re-loaded upon entering and er-written upon leaving. We can use a typecast to non-volatile within a critical section, and achieve both data consisitency and compiler optimization. OpenMP has a better mechanism for this, where a flush-operand (#pragma omp flush) will force a synchronization of shared data among threads (write and reload), and volatile is never needed for consistency of shared data. Sturla ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
Martin v. Löwis: I guess all this advice doesn't really apply to this case, though. The Microsoft API declares the parameter as a volatile*, indicating that they consider it proper usage of the API to declare the storage volatile. The 'volatile' here is a modifier on the parameter and does not require a corresponding agreement in the variable declaration. It indicates that all access through the pointer inside the function will be with volatile semantics. As long as all functions that operate on the variable do so treating access as volatile then everything is fine. You should only need to declare the variable as volatile if there is other code that accesses it directly. If agreement was required then the compiler would print a warning. It is similar to declaring a function to take a const parameter: there is no need for the variable to also be const. Neil ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
On Thu, 10 Mar 2011 12:18:22 +0100 Sturla Molden stu...@molden.no wrote: Obvious usecases for volatile are: - Implementation of a spinlock, where register allocation is detrimental. - A buffer that is filled from the outside with some DMA mechanism. - Real-time programs and games where order of execution and and timing is critical, so optimization must be supressed. - variables mutated from signal handlers Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
I am sorry for misreading 32 bits (4 bytes) as 32 bytes. That is obviously very different. If Microsoft's malloc is sufficient, why does MSDN tell us to use _aligned_malloc instead of malloc? I don't know. Perhaps they assume that people may be using alternative malloc implementations, or (more likely) people in the platform group are unaware of the guarantees the compiler group makes (because of compliance with the C standard). Obvious usecases for volatile are: - Implementation of a spinlock, where register allocation is detrimental. Unfortunately, that's not true. It's not really possible to implement a spinlock in C if the processor can make changes to the write ordering. - A buffer that is filled from the outside with some DMA mechanism. - Real-time programs and games where order of execution and and timing is critical, so optimization must be supressed. Likewise. On many current processors, you have to use fencing instructions to make this kind of stuff work, something that the compiler will not normally generate (except through inline assembly). Even though volatile is not needed for processing within a critical section, we still need the shared data to be re-loaded upon entering and er-written upon leaving. Please see the code. Where exactly does such reloading/rewriting need to happen? Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
Am 10.03.11 07:55, schrieb Neil Hodgson: Martin v. Löwis: I guess all this advice doesn't really apply to this case, though. The Microsoft API declares the parameter as a volatile*, indicating that they consider it proper usage of the API to declare the storage volatile. The 'volatile' here is a modifier on the parameter and does not require a corresponding agreement in the variable declaration. I see. So then we don't need to make any changes. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
I guess all this advice doesn't really apply to this case, though. The Microsoft API declares the parameter as a volatile*, indicating that they consider it proper usage of the API to declare the storage volatile. So ISTM that we should comply regardless of whether volatile is considered morally wrong in the general case. Microsoft compiler implements Microsoft-specific behaviour for volatile variables, making them close to volatiles in Java: http://msdn.microsoft.com/en-us/library/12a04hfd(v=VS.100).aspx That may be the reason they use it in Interlocked* functions. As already been said, the thread_nt doesn't need any changes. Eugene ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
These issues are best put in the tracker so they don't get lost - especially at the moment with lots of regulars at pycon. It would also be good to know if there is an actual behaviour bug caused by this (ie, what problems can be observed which are caused by the current code?) Cheers, Mark On 10/03/2011 12:25 PM, Sturla Molden wrote: Atomic operations (InterlockedCompareExchange, et al.) are used on the field 'owned' in NRMUTEX. These methods require the memory to be aligned on 32-byte boundaries. They also require the volatile qualifer. Three small changes are therefore needed (see below). Regards, Sturla Molden typedef struct NRMUTEX { volatile LONG owned ; /* Bugfix: remember volatile */ DWORD thread_id ; HANDLE hevent ; } NRMUTEX, *PNRMUTEX; NRMUTEX AllocNonRecursiveMutex(void) { PNRMUTEX mutex = (PNRMUTEX)_aligned_malloc(sizeof(NRMUTEX),32) ; /* Bugfix: align to 32-bytes */ if (mutex !InitializeNonRecursiveMutex(mutex)) { free(mutex) ; mutex = NULL ; } return mutex ; } void FreeNonRecursiveMutex(PNRMUTEX mutex) { if (mutex) { DeleteNonRecursiveMutex(mutex) ; _aligned_free(mutex) ; /* Bugfix: align to 32-bytes */ } } ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/skippy.hammond%40gmail.com ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
Den 10.03.2011 03:02, skrev Mark Hammond: These issues are best put in the tracker so they don't get lost - especially at the moment with lots of regulars at pycon. Ok, sorry :-) It would also be good to know if there is an actual behaviour bug caused by this (ie, what problems can be observed which are caused by the current code?) None that I have observed, but this is required according to MSDN. Theoretically, an optimizing compiler could cache the 'owned' field if it's not declared volatile. It currently works because a wait on the lock is implemented with a WaitForSingleObject on a kernel event object when the waitfalg is set. If the wait mechanism is changed to a much less expensive user-space spinlock, just releasing the time-slice by Sleep(0) for each iteration, it will certainly fail without a volatile qualifier. As for InterlockedCompareExchange et al., MSDN says this: The parameters for this function must be aligned on a 32-bit boundary; otherwise, the function will behave unpredictably on multiprocessor x86 systems and any non-x86 systems. See _aligned_malloc. Well, it does not hurt to obey :-) Regards, Sturla ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Bugs in thread_nt.h
Am 09.03.11 20:25, schrieb Sturla Molden: These methods require the memory to be aligned on 32-byte boundaries. You misread the documentation - it's a 32-*bit* boundary that the LONG variable must be on. The malloc() call that is currently used trivially meets this requirement. As for the volatile marker - I believe the code is also correct without it, since the owned field is only accessed through initialization and Interlocked operations. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com