Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Ulrich Eckhardt
On Thursday 10 March 2011, Sturla Molden wrote:
 As for InterlockedCompareExchange et al., MSDN says this: The
 parameters for this function must be aligned on a 32-bit boundary;

bit != byte

Uli



**
Domino Laser GmbH, Fangdieckstraße 75a, 22547 Hamburg, Deutschland
Geschäftsführer: Thorsten Föcking, Amtsgericht Hamburg HR B62 932
**
Visit our website at http://www.dominolaser.com/
**
Diese E-Mail einschließlich sämtlicher Anhänge ist nur für den Adressaten 
bestimmt und kann vertrauliche Informationen enthalten. Bitte benachrichtigen 
Sie den Absender umgehend, falls Sie nicht der beabsichtigte Empfänger sein 
sollten. Die E-Mail ist in diesem Fall zu löschen und darf weder gelesen, 
weitergeleitet, veröffentlicht oder anderweitig benutzt werden.
E-Mails können durch Dritte gelesen werden und Viren sowie nichtautorisierte 
Änderungen enthalten. Domino Laser GmbH ist für diese Folgen nicht 
verantwortlich.
**


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Paul Du Bois
On Wed, Mar 9, 2011 at 8:42 PM, Martin v. Löwis mar...@v.loewis.de wrote:
 As for the volatile marker - I believe the code is also
 correct without it, since the owned field is only accessed
 through initialization and Interlocked operations.

Furthermore, if the code weren't correct, volatile would only be
half the solution since
it only restricts compiler-induced reorders. Some sort of CPU fence instruction
would almost certainly be needed as well.

Sturla wrote:
 releasing the time-slice by Sleep(0) for each iteration, it will certainly 
 fail without a volatile qualifier.

In general, your compiler will reload globally-accessible memory after
a function call (or
after a memory store). In fact, it can be quite a job to give them
enough hints that they stop doing so!
This behavior (which volatile aggravates) unfortunately makes it
even tougher to find race
conditions. In my experience, volatile should be avoided. I'd even bet
money that some grumpy
person has written a volatile considered harmful essay.

I Am Not A memory-model expert so consult your local guru,
p
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Scott Dial
On 3/10/2011 3:07 AM, Paul Du Bois wrote:
 volatile considered harmful

http://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt

-- 
Scott Dial
sc...@scottdial.com
scod...@cs.indiana.edu
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Martin v. Löwis

This behavior (which volatile aggravates) unfortunately makes it
even tougher to find race conditions. In my experience,

 volatile should be avoided. I'd even bet money that some grumpy

person has written a volatile considered harmful essay.


I guess all this advice doesn't really apply to this case, though.
The Microsoft API declares the parameter as a volatile*, indicating
that they consider it proper usage of the API to declare the storage
volatile. So ISTM that we should comply regardless of whether volatile
is considered morally wrong in the general case.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Sturla Molden

Den 10.03.2011 11:06, skrev Scott Dial:

http://www.kernel.org/doc/Documentation/volatile-considered-harmful.txt





The important part here (forgive me for being a pedant) is that register 
allocation of the (1) 'owned' field is actually unwanted, and (2) 
Microsoft specify 'volatile' in calls to the Interlocked* API-functions.


I am sorry for misreading 32 bits (4 bytes) as 32 bytes. That is 
obviously very different. If Microsoft's malloc is sufficient, why does 
MSDN tell us to use _aligned_malloc instead of malloc?



The rest is a comment to the link and a bit OT:

Their argument is that (1) volatile supresses optimization; (2) within a 
critical section, access to shared data is synchronized; and thus (3) 
volatile is critical there.


That is true, to some extent.

Volatile is not a memory barrier, nor a mutex. Volatile's main purpose 
is to prevent the compiler from storing  a variable in a register. 
Volatile might be used incorrectly if we don't understand this.


Obvious usecases for volatile are:

- Implementation of a spinlock, where register allocation is detrimental.
- A buffer that is filled from the outside with some DMA mechanism.
- Real-time programs and games where order of execution and and timing 
is critical, so optimization must be supressed.


Even though volatile is not needed for processing within a critical 
section, we still need the shared data to be re-loaded upon entering and 
er-written upon leaving. We can use a typecast to non-volatile within a 
critical section, and achieve both data consisitency and compiler 
optimization. OpenMP has a better mechanism for this, where a 
flush-operand (#pragma omp flush) will force a synchronization of shared 
data among threads (write and reload), and volatile is never needed for 
consistency of shared data.


Sturla






___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Neil Hodgson
Martin v. Löwis:

 I guess all this advice doesn't really apply to this case, though.
 The Microsoft API declares the parameter as a volatile*, indicating
 that they consider it proper usage of the API to declare the storage
 volatile.

   The 'volatile' here is a modifier on the parameter and does not
require a corresponding agreement in the variable declaration. It
indicates that all access through the pointer inside the function will
be with volatile semantics. As long as all functions that operate on
the variable do so treating access as volatile then everything is
fine. You should only need to declare the variable as volatile if
there is other code that accesses it directly.

   If agreement was required then the compiler would print a warning.

   It is similar to declaring a function to take a const parameter:
there is no need for the variable to also be const.

   Neil
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Antoine Pitrou
On Thu, 10 Mar 2011 12:18:22 +0100
Sturla Molden stu...@molden.no wrote:
 
 Obvious usecases for volatile are:
 
 - Implementation of a spinlock, where register allocation is detrimental.
 - A buffer that is filled from the outside with some DMA mechanism.
 - Real-time programs and games where order of execution and and timing 
 is critical, so optimization must be supressed.

- variables mutated from signal handlers

Regards

Antoine.


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Martin v. Löwis

I am sorry for misreading 32 bits (4 bytes) as 32 bytes. That is
obviously very different. If Microsoft's malloc is sufficient, why does
MSDN tell us to use _aligned_malloc instead of malloc?


I don't know. Perhaps they assume that people may be using alternative
malloc implementations, or (more likely) people in the platform group
are unaware of the guarantees the compiler group makes (because of
compliance with the C standard).


Obvious usecases for volatile are:

- Implementation of a spinlock, where register allocation is detrimental.


Unfortunately, that's not true. It's not really possible to implement
a spinlock in C if the processor can make changes to the write ordering.


- A buffer that is filled from the outside with some DMA mechanism.
- Real-time programs and games where order of execution and and timing
is critical, so optimization must be supressed.


Likewise. On many current processors, you have to use fencing 
instructions to make this kind of stuff work, something that the 
compiler will not normally generate (except through inline assembly).



Even though volatile is not needed for processing within a critical
section, we still need the shared data to be re-loaded upon entering and
er-written upon leaving.


Please see the code. Where exactly does such reloading/rewriting need to 
happen?


Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Martin v. Löwis

Am 10.03.11 07:55, schrieb Neil Hodgson:

Martin v. Löwis:


I guess all this advice doesn't really apply to this case, though.
The Microsoft API declares the parameter as a volatile*, indicating
that they consider it proper usage of the API to declare the storage
volatile.


The 'volatile' here is a modifier on the parameter and does not
require a corresponding agreement in the variable declaration.


I see. So then we don't need to make any changes.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-10 Thread Eugene Toder
 I guess all this advice doesn't really apply to this case, though.
 The Microsoft API declares the parameter as a volatile*, indicating
 that they consider it proper usage of the API to declare the storage
 volatile. So ISTM that we should comply regardless of whether volatile
 is considered morally wrong in the general case.

Microsoft compiler implements Microsoft-specific behaviour for
volatile variables, making them close to volatiles in Java:
http://msdn.microsoft.com/en-us/library/12a04hfd(v=VS.100).aspx
That may be the reason they use it in Interlocked* functions.

As already been said, the thread_nt doesn't need any changes.

Eugene
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-09 Thread Mark Hammond
These issues are best put in the tracker so they don't get lost - 
especially at the moment with lots of regulars at pycon.


It would also be good to know if there is an actual behaviour bug caused 
by this (ie, what problems can be observed which are caused by the 
current code?)


Cheers,

Mark

On 10/03/2011 12:25 PM, Sturla Molden wrote:


Atomic operations (InterlockedCompareExchange, et al.) are used on the
field 'owned' in NRMUTEX. These methods require the memory to be aligned
on 32-byte boundaries. They also require the volatile qualifer. Three
small changes are therefore needed (see below).


Regards,
Sturla Molden





typedef struct NRMUTEX {
volatile LONG owned ; /* Bugfix: remember volatile */
DWORD thread_id ;
HANDLE hevent ;
} NRMUTEX, *PNRMUTEX;


NRMUTEX
AllocNonRecursiveMutex(void)
{
PNRMUTEX mutex = (PNRMUTEX)_aligned_malloc(sizeof(NRMUTEX),32) ; /*
Bugfix: align to 32-bytes */
if (mutex  !InitializeNonRecursiveMutex(mutex))
{
free(mutex) ;
mutex = NULL ;
}
return mutex ;
}

void
FreeNonRecursiveMutex(PNRMUTEX mutex)
{
if (mutex)
{
DeleteNonRecursiveMutex(mutex) ;
_aligned_free(mutex) ; /* Bugfix: align to 32-bytes */
}
}







___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/skippy.hammond%40gmail.com



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-09 Thread Sturla Molden

Den 10.03.2011 03:02, skrev Mark Hammond:
These issues are best put in the tracker so they don't get lost - 
especially at the moment with lots of regulars at pycon.



 Ok, sorry :-)


It would also be good to know if there is an actual behaviour bug 
caused by this (ie, what problems can be observed which are caused by 
the current code?)


None that I have observed, but this is required according to MSDN.

Theoretically, an optimizing compiler could cache the 'owned' field if 
it's not declared volatile. It currently works because a wait on the 
lock is implemented with a WaitForSingleObject on a kernel event object 
when the waitfalg is set. If the wait mechanism is changed to a much 
less expensive user-space spinlock, just releasing the time-slice by 
Sleep(0) for each iteration, it will certainly fail without a volatile 
qualifier.


As for InterlockedCompareExchange et al., MSDN says this: The 
parameters for this function must be aligned on a 32-bit boundary; 
otherwise, the function will behave unpredictably on multiprocessor x86 
systems and any non-x86 systems. See _aligned_malloc.


Well, it does not hurt to obey :-)

Regards,
Sturla



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Bugs in thread_nt.h

2011-03-09 Thread Martin v. Löwis

Am 09.03.11 20:25, schrieb Sturla Molden:

These methods require the memory to be aligned
on 32-byte boundaries.


You misread the documentation - it's a 32-*bit*
boundary that the LONG variable must be on. The malloc()
call that is currently used trivially meets this requirement.

As for the volatile marker - I believe the code is also
correct without it, since the owned field is only accessed
through initialization and Interlocked operations.

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com