On Saturday, March 10, 2018 at 1:18:40 PM UTC-8, John Hening wrote:
>
> Gil, thanks for your response. It is very helpful. 
>
> In your specific example above, there is actually no ordering question, 
> because your writeTask() operation doesn't actually observe the state 
> changed by connection.configureBlocking(false)
>
>
> I agree that my question wasn't correct. There is no 'ordering' issue 
> here. I meant visibility. 
>

Visibility and ordering are related. Questions about the visibility of 
state (like that of the field "blocking") apply only to things that 
interact with that state. And when things interact with some state, the 
order in which changes to that state become visible (relative to 
changes to other state, e.g. the enqueuing of an operation via 
Executor.execute() becoming visible) has (or doesn't have) certain guarantees.

E.g. in the example discussed, with the synchronized blocks in place on 
both the writer and the reader of the field "blocking", we are guaranteed 
that the change of "blocking = false" is visible to the thread that 
executes writeTask() (if writeTask actually uses the value of "blocking" 
obtained within the synchronized block) before the request to execute 
write() is visible to that same thread...
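As a minimal sketch of that both-ends pattern (FakeChannel and its fields are hypothetical stand-ins, not the actual SocketChannel internals): the writer publishes "blocking" under a lock, the reader reads it under the same lock, and the executor handoff sits in between.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the channel: "blocking" is written and read
// under the same lock, mirroring configureBlocking()/isBlocking().
class FakeChannel {
    private final Object regLock = new Object();
    private boolean blocking = true;

    void configureBlocking(boolean block) {
        synchronized (regLock) {
            blocking = block;   // write under regLock
        }
    }

    boolean isBlocking() {
        synchronized (regLock) {
            return blocking;    // read under the same lock
        }
    }
}

public class BothEndsLocking {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        FakeChannel connection = new FakeChannel();
        connection.configureBlocking(false);
        // The execute() handoff also orders the write before the task runs.
        executor.execute(() ->
            System.out.println("blocking = " + connection.isBlocking()));
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```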
 

>  Without the use of synchronized in isBlocking(), the use of synchronized 
> in configureBlocking() wouldn't make a difference.
>
> Yes, semi-synchronized doesn't work. So, I conclude that without 
> synchronization the result of `blocking = false` could be invisible to 
> writeTask, am I right?
>

It is not a matter of being invisible. It's a field in a shared object, so 
all operations on it are eventually visible (to things that access it). 
What you can certainly say here is that without using synchronized blocks 
*on both ends* (both the writer and the reader), and without being replaced 
by some other ordering mechanism *on both ends*, your writeTask could 
observe a value of "blocking" that predates the modification of it in the 
thread that calls connection.configureBlocking().
 

> As to your question about the possibility of "skipping" some write operations.
>
>
> By skipping I meant 'being invisible to observers'. For example, if a 
> thread t1 reads some non-volatile integer x, it is possible that t1 
> always sees the same value of x (even though another thread t2 modifies 
> x).
>

That (t1 always seeing the same value of x while x is modified elsewhere) is 
possible, e.g. in a tight loop reading x and nothing else. But that will 
only happen if no other ordering constructs force the visibility of 
modifications to x. E.g. if thread t1 reads some volatile field y that thread 
t2 modifies after modifying x, then thread t1 will observe the modified 
value of x in reads that occur after observing the modified value of y. In 
such a case, it won't "always see the same value of x".
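A runnable sketch of that piggy-backed visibility: x is a plain field, y is volatile, and the volatile write/read pair carries the plain write across.

```java
// t2 (here, the main thread) writes x, then the volatile y. Once t1
// observes y == true, the JMM guarantees its subsequent read of x sees 42:
// the write of x happens-before the write of y, which happens-before the
// read of y that returned true.
public class Piggyback {
    static int x;                 // plain, non-volatile field
    static volatile boolean y;    // volatile field

    public static void main(String[] args) throws Exception {
        Thread t1 = new Thread(() -> {
            while (!y) { }        // spin until the volatile write is observed
            System.out.println("x = " + x);   // guaranteed to print x = 42
        });
        t1.start();
        x = 42;                   // plain write first...
        y = true;                 // ...then the volatile write publishes it
        t1.join();
    }
}
```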


>
>
> 1. It is interesting to me what happens in a situation like this:
>
>    while(true) {
>         SocketChannel connection = serverSocketChannel.accept();
>         connection.configureBlocking(false);
>         Unsafe.storeFence();
>         executor.execute(() -> writeTask(connection));
>     }
>     void writeTask(SocketChannel s){
>         (***)
>         any_static_global_field = s.isBlocking();
>     }
>
> To my eye it should work, but I have doubts. What does storeFence mean? 
> Please flush it to memory immediately! 
>

Unsafe.storeFence doesn't mean "flush...". It means "Ensures lack of 
reordering of stores before the fence with loads or stores after the 
fence." (that's literally what the Javadoc for it says).
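For reference, the same fences are available without Unsafe on Java 9+ as static methods on java.lang.invoke.VarHandle; Unsafe.storeFence() corresponds to VarHandle.releaseFence() (and loadFence() to acquireFence()). A minimal sketch, assuming Java 9+:

```java
import java.lang.invoke.VarHandle;

public class Fences {
    static int data;

    public static void main(String[] args) {
        data = 1;
        // Equivalent of Unsafe.storeFence(): stores before the fence are not
        // reordered with loads or stores after it (a release fence). It says
        // nothing about flushing, and nothing about what other threads see
        // unless they use a matching ordering construct on their side.
        VarHandle.releaseFence();
        data = 2;
        System.out.println("data = " + data);
    }
}
```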
 

> So, it will be visible before the executor thread starts the task. But it 
> seems that, here, a load fence is not necessary (***). Why? The blocking 
> field must be read from memory (there is no possibility that it is cached, 
> because it is read for the first time by the executor thread). When it 
> comes to the CPU cache, it may be cached, but the cache is coherent = no 
> problem. Moreover, there is no need to ensure ordering here. So, loadFence 
> is not necessary. Yes? 
>

No. At least not quite. For this specific sequence, you already have the 
ordering you want, but not for the reasons you think.

First, please put aside this notion that there is some memory, and some 
cache or store buffer, and some flushing going on. This ordering and 
visibility stuff has nothing to do with any of those potential 
implementation details, and trying to explain things in terms of those 
potential (and incomplete) implementation details mostly serves to confuse 
the issue. A tip I give people for thinking about this stuff is: Always 
think of the compiler as the culprit when it comes to reordering, and in 
that thinking, imagine the compiler being super-smart and super-mean. The 
compiler is allowed to create all sorts of evil, ingenious and 
pre-cognitive reorderings, cachings, and redundant or dead operation 
eliminations (including pre-caching of values it thinks you might want to 
read later, and using the pre-cached values if it turns out to be right) 
unless it is specifically told otherwise. And the compiler has no need for 
physical implementation things like memory, caches, and store buffers, and 
flushes to mess you up. Virtually all the bad things that can happen can be 
achieved that way, and many bad things can be made to happen this way that 
CPUs may never be able to cause on their own. Using the mental modeling of 
those physical implementation things is not only unneeded; it will often 
give you a false sense of coverage (like e.g. "there is no possibility that 
it would be cached, because it is read for the first time").

In itself, making sure that the stores and loads after the fence in the 
code above are not reordered with stores before the fence (all on the thread 
that makes the change to the "blocking" field state) does *nothing* to 
ensure that the read of the same state on the executor thread observes the 
modified state. E.g. what if the connection object was not a new object, 
but somehow recycled? And what if that same object was previously accessed 
by the same executor thread, and the thread examined the state of the 
"blocking" field in that previous incarnation?

However, in practice, the thing that does provide you with the proper 
ordering on the executor thread is the combination of the enqueuing 
operation in the Executor.execute() call and the dequeuing operation in the 
executor thread. The enqueuing operation will likely have volatile store 
semantics (such that all stores that occur before it will be visible to 
readers of those stores before the enqueue operation is visible to them), 
and the dequeue operation likely has volatile load semantics (such that all 
loads occurring after the dequeue operation observe at least the values 
that were made visible before the enqueue was made visible to be read by the 
dequeue). With this enqueue/dequeue ordering in between your modification of 
"blocking" and the execution of writeTask(), you effectively have the load 
ordering you asked about. So you don't need an extra thing to provide the 
load-load ordering on the executor side. Note you don't need 
the Unsafe.storeFence() on the originating side, either.
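This handoff guarantee is documented: the java.util.concurrent package documentation states that actions in a thread prior to submitting a Runnable to an Executor happen-before its execution begins. A minimal demonstration, with no fences and no volatile anywhere:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// A plain, non-volatile write made before execute() is guaranteed visible
// inside the submitted task, courtesy of the Executor's documented
// happens-before edge between submission and execution.
public class HandoffOrdering {
    static int plainField;   // deliberately not volatile, not synchronized

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(1);
        plainField = 7;      // write before the submission...
        executor.execute(() ->
            System.out.println("observed " + plainField)); // ...always sees 7
        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```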
 

>
> 2. 
> volatile int foo;
> ...
> foo = 1;
> foo = 2;
> foo = 3;
>
>
>
> It is very interesting. So, after being JITed on x86 it can look like:
>
> mov &foo, 1
> sfence
> mov &foo, 2
> sfence
> mov &foo, 3
> sfence
>
>
> Are you sure that CPU can execute that as:
> mov &foo, 3
> sfence
>
>
The CPU is allowed to execute it that way (how would you be able to tell 
the difference?). And the JIT compiler is allowed to transform it to that 
to begin with...


> ?
>
> I know that: 
>
> mov &foo, 1
> mov &foo, 2
> mov &foo, 3 
>
>
>
> the x86 CPU can legally optimize it. 
>
> On Friday, March 9, 2018 at 11:20:37 PM UTC+1, John Hening wrote:
>>
>>
>>     executor = Executors.newFixedThreadPool(16);
>>     while(true) {
>>         SocketChannel connection = serverSocketChannel.accept();
>>         connection.configureBlocking(false);
>>         executor.execute(() -> writeTask(connection)); 
>>     }
>>     void writeTask(SocketChannel s){
>>         s.isBlocking();
>>     }
>>
>>     public final SelectableChannel configureBlocking(boolean block) 
>> throws IOException
>>     {
>>         synchronized (regLock) {
>>             ...
>>             blocking = block;
>>         }
>>         return this;
>>     }
>>
>>
>>
>> We see the following situation: the main thread is calling 
>> connection.configureBlocking(false)
>>
>> and another thread (launched by the executor) is reading that field. So, 
>> it looks like a data race.
>>
>> My question is:
>>
>> 1. Here configureBlocking is synchronized, so it acts as a memory 
>> barrier. It means the code is OK, even if reading/writing the blocking 
>> field is not synchronized, since reading/writing a boolean is atomic.
>>
>> 2. What if configureBlocking weren't synchronized? What then? I think 
>> it would be necessary to emit a memory barrier, because it is 
>> theoretically possible that setting the blocking field could be 
>> reordered.
>>
>> Am I right?
>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"mechanical-sympathy" group.