Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-23 Thread Peter Rabbitson
On Mon, Jan 23, 2023 at 10:54 PM Ian Lance Taylor  wrote:

> There is a model of memory behavior in which programs use pure write
> memory barriers and pure read memory barriers.  However, Go does not
> use that model.  Go uses a different model, in which writers and
> readers are expected to cooperate using atomic loads and stores.
>

Understood. Thank you (and Keith) for stating this enough times that it
clicked for me: it's not that what I am trying to do is impossible, but
rather there is nothing in go that exposes these types of barriers to
userspace by design. Having this mentioned explicitly in the memory model
doc would have helped me build the correct frame of reference early on.


> As such, Go does not provide a pure write memory barrier.  I promise.
> (As I noted earlier, Go does permit calling into C, and therefore
> permits you to do anything that C permits you to do.  It's worth
> noting that because C code can do anything, a call into a C function
> is a full compiler (but not hardware) memory barrier.)
>

The CGO compile-time ergonomics overhead is a bit of a concern in my case.
Does calling into non-inlineable(?) go-asm functions achieve the same
compiler-ordering-barrier? I.e. is the following kosher / reasonably
forward-compatible: https://stackoverflow.com/q/42911884 ?

Thank you!



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-23 Thread Ian Lance Taylor
On Sun, Jan 22, 2023 at 11:36 PM Peter Rabbitson  wrote:
>
> That's a fair point. I avoided going into details not to risk tickling latent 
> design-urges of the readers ;)
>
> Setup:
> - Single-writer multiple-readers scenario
> - Writer is always exclusively single threaded, no concurrency whatsoever. 
> Only possible sources of operation reordering are: a) the discrete CPU 
> execution pipeline b) the compiler itself c) OS preemption/ SMP migration.
> - Communication is over a single massive mmaped file-backed region.
> - Exploits the fact that on Linux the VFS cache in front of the named file 
> and the mmaped "window" within every process are all literally the same 
> kernel memory.
> - Communication is strictly one-way: the writer does not know nor care about
> the number of readers, what they are looking at, etc.
> - Readers are expected to accommodate the above, be prepared to look at stale
> data, etc
> - For simplicity assume that the file/mmap is of unreachable size ( say 1PiB 
> ) and that additions are all appends, with no garbage collection - stale data 
> which is not referenced by anything just sticks around indefinitely.
>
> Writer pseudocode ( always only one thread, has exclusive write access )
> 1. Read current positioning from mmap offset 0 - no locks needed since I am 
> the one who modified things last
> 2. Do the payload writes, several GiB append within the unused portion of the 
> mmap
> 3. Writeout necessary indexes and pointers to the contents of 2, another 
> append this time several KiB
> 4. {{ MY QUESTION }} Emit a SFENCE/LOCK(amd64) or DMB(arm64) to ensure 
> sequencing consistency and that all CPUs see the same state of the kernel 
> memory backing the mmap
> 5. Write a single uint64 at mmap offset 0, pointing to the new "state of the 
> world" written during 3. which in turn points at various pieces of data 
> written in 2.
> 6. goto 1
>
> Readers pseudocode ( many readers, various implementation languages not just 
> go,  utterly uncoordinated, happy to see "old transaction", but expect 5 => 3 
> => 2 to be always consistent )
> 1. Read current positioning from mmap offset 0 - no locks as I am equally 
> happy to see the new or old uint64. I do assume that a word-sized read is 
> always atomic, and I won't see a "torn" u64
> 2. Walk around either the new or old network of pointers. The barrier 4. in 
> the writer ensures I can't see a pointer to something that doesn't yet exist.
>
> The end.
>
>>
>> There is no Go equivalent to a pure write memory barrier.
>
>
> Ian, I recognize I am speaking to one of the language creators and that you 
> know *way* more than me about this subject. Nevertheless I find it really 
> hard to accept your statement. There has to be a set of constructs that have
> the desired side-effects described in step 4 above. I also still maintain that the
> memory model should discuss this, in the compilation guarantees section at 
> the bottom. After all a standalone go program is nothing more than a list of 
> instructions for a CPU mediated by an OS. The precise sequencing of these 
> instructions in special circumstances should be clear/controllable.
>
> I guess I will spend some time to learn how to poke around the generated 
> assembly tomorrow...

There is a model of memory behavior in which programs use pure write
memory barriers and pure read memory barriers.  However, Go does not
use that model.  Go uses a different model, in which writers and
readers are expected to cooperate using atomic loads and stores.

As such, Go does not provide a pure write memory barrier.  I promise.
(As I noted earlier, Go does permit calling into C, and therefore
permits you to do anything that C permits you to do.  It's worth
noting that because C code can do anything, a call into a C function
is a full compiler (but not hardware) memory barrier.)

Ian



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-23 Thread 'Keith Randall' via golang-nuts
Just to be clear, to get what you want just write data normally for steps 
1-4 and use an atomic store for step 5. That guarantees that other 
processes will see steps 1-4 all done if they see the write from step 
5. (But you *do* have to use an atomic read appropriate to your language to 
do reader step 1. Just a standard read will not do.)

Go does not provide separate "pure" memory barriers. The compiler and/or 
runtime include them when needed to ensure the required semantics for 
locks, atomic operations, etc.
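
As a rough Go sketch of that pattern (all names here are hypothetical, and the
mmapped region is assumed to be exposed as a page-aligned []byte):

// A minimal sketch, not from this thread: plain writes for the bulk data,
// one atomic store to publish, one atomic load on the reader side.
package shm

import (
	"sync/atomic"
	"unsafe"
)

// header views the first 8 bytes of the mapping as a uint64; mmap'd memory
// is page-aligned, so the address is suitably aligned for 64-bit atomics.
func header(buf []byte) *uint64 {
	return (*uint64)(unsafe.Pointer(&buf[0]))
}

// Publish does the bulk writes (steps 1-4 above) as ordinary stores, then
// publishes the new root offset with a single atomic store (step 5).
func Publish(buf []byte, off uint64, payload, index []byte) {
	copy(buf[off:], payload)                // bulk payload, plain writes
	idxOff := off + uint64(len(payload))
	copy(buf[idxOff:], index)               // indexes/pointers, plain writes
	atomic.StoreUint64(header(buf), idxOff) // the one atomic publish
}

// Snapshot is the reader's step 1: an atomic load of the root, after which
// ordinary reads of whatever it references are covered by that load.
func Snapshot(buf []byte) uint64 {
	return atomic.LoadUint64(header(buf))
}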

On Monday, January 23, 2023 at 2:13:20 AM UTC-8 Gergely Födémesi wrote:

> On 1/23/23, Peter Rabbitson  wrote:
> ...
> > I guess I will spend some time to learn how to poke around the generated
> > assembly tomorrow...
>
> If I understand correctly you are trying to force your model of the
> world into the Go memory model. The models are different, so this
> won't work.
>
> Please also note that your model of current execution complexes is
> probably valid today, but it could change at any time. The Go memory model
> is restrictive in a different way, to accommodate that future.
>
> Of course you can implement what you want using any tool available,
> but the correct execution can't be ensured by the Go memory model if
> you don't build on that.
>



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-23 Thread fgergo
On 1/23/23, Peter Rabbitson  wrote:
...
> I guess I will spend some time to learn how to poke around the generated
> assembly tomorrow...

If I understand correctly you are trying to force your model of the
world into the Go memory model. The models are different, so this
won't work.

Please also note that your model of current execution complexes is
probably valid today, but it could change at any time. The Go memory model
is restrictive in a different way, to accommodate that future.

Of course you can implement what you want using any tool available,
but the correct execution can't be ensured by the Go memory model if
you don't build on that.



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Peter Rabbitson
On Mon, Jan 23, 2023 at 7:04 AM Ian Lance Taylor  wrote:

> Memory ordering only makes sense in terms of two different execution
> threads using shared memory.  In order to answer your question
> precisely, you need to tell us what the process reading the memory
> region is going to do to access the memory.  In order to know how to
> write the memory, it's necessary to know how the memory is going to be
> read.
>

That's a fair point. I avoided going into details not to risk tickling
latent design-urges of the readers ;)

Setup:
- Single-writer multiple-readers scenario
- Writer is always exclusively single threaded, no concurrency whatsoever.
Only possible sources of operation reordering are: a) the discrete CPU
execution pipeline b) the compiler itself c) OS preemption/ SMP migration.
- Communication is over a single massive mmaped file-backed region.
- Exploits the fact that on Linux the VFS cache in front of the named file
and the mmaped "window" within every process are all literally the same
kernel memory.
- Communication is strictly one-way: the writer does not know nor care about
the number of readers, what they are looking at, etc.
- Readers are expected to accommodate the above, be prepared to look at stale
data, etc
- For simplicity assume that the file/mmap is of unreachable size ( say
1PiB ) and that additions are all appends, with no garbage collection -
stale data which is not referenced by anything just sticks around
indefinitely.

Writer pseudocode ( always only one thread, *has exclusive write access* )
1. Read current positioning from mmap offset 0 - no locks needed since I am
the one who modified things last
2. Do the payload writes, several GiB append within the unused portion of
the mmap
3. Writeout necessary indexes and pointers to the contents of 2, another
append this time several KiB
4. {{ MY QUESTION }} Emit a SFENCE/LOCK(amd64) or DMB(arm64) to ensure
sequencing consistency and that all CPUs see the same state of the kernel
memory backing the mmap
5. Write a single uint64 at mmap offset 0, pointing to the new "state of
the world" written during 3. which in turn points at various pieces of data
written in 2.
6. goto 1

Readers pseudocode ( many readers, various implementation languages not
just go,  utterly uncoordinated, happy to see "old transaction", but *expect
5 => 3 => 2 to be always consistent* )
1. Read current positioning from mmap offset 0 - no locks as I am equally
happy to see the new or old uint64. I do assume that a word-sized read is
always atomic, and I won't see a "torn" u64
2. Walk around either the new or old network of pointers. The barrier 4. in
the writer ensures I can't see a pointer to something that doesn't yet
exist.

The end.


> There is no Go equivalent to a pure write memory barrier.


Ian, I recognize I am speaking to one of the language creators and that you
know *way* more than me about this subject. Nevertheless I find it really
hard to accept your statement. There has to be a set of constructs that
have the desired side-effects described in step 4 above. I also still maintain
that the memory model should discuss this, in the compilation guarantees
section at the bottom. After all a standalone go program is nothing more
than a list of instructions for a CPU mediated by an OS. The precise
sequencing of these instructions in special circumstances should be
clear/controllable.

I guess I will spend some time to learn how to poke around the generated
assembly tomorrow...



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Ian Lance Taylor
On Sun, Jan 22, 2023 at 9:12 AM Peter Rabbitson  wrote:
>
> On Sun, Jan 22, 2023 at 5:49 PM Ian Lance Taylor  wrote:
>>
>> On Sat, Jan 21, 2023, 11:12 PM Peter Rabbitson (ribasushi) 
>>  wrote:
>>>
>>> This question is focused exclusively on the writer side.
>>
>>
>> Perhaps I misunderstand, but it doesn't make sense to ask a question about 
>> the memory model only about one side or the other.  The memory model is 
>> about communication between two goroutines.  It has very little to say about 
>> the behavior of a single goroutine.
>
>
> I might be using the wrong term then, although a lot of text in 
> https://go.dev/ref/mem is relevant, it just does not answer my very specific 
> question. Let me try from a different angle:
>
> I want to write a single-threaded program in go which once compiled has a
> certain behavior from the point of view of the OS. More specifically this
> program, from the Linux OS point of view, should in very strict order:
> 1. grab a memory region
> 2. write an arbitrary amount of data to the region's end
> 3. write some more data to the region start
>
> By definition 1) will happen first ( you got to grab in order to write ), but 
> it is also critical that the program does all of 2), before it starts doing 
> 3).
>
> Modern compilers are wicked smart, and often do emit assembly that would have 
> some of 3) interleaved with 2). Moreover, due to kernel thread preemption 2) 
> and 3) could execute simultaneously on different physical CPUs, requiring a 
> NUMA-global synchronization signal to prevent this ( I believe it is LOCK on 
> x86 ). I am trying to understand how to induce all of this ordering from 
> within a go program in the most lightweight manner possible ( refer to the 
> problem statement and attempts starting on line 38 at 
> https://go.dev/play/p/ZXMg_Qq3ygF )

Memory ordering only makes sense in terms of two different execution
threads using shared memory.  In order to answer your question
precisely, you need to tell us what the process reading the memory
region is going to do to access the memory.  In order to know how to
write the memory, it's necessary to know how the memory is going to be
read.

That said, it's fairly likely that if you use an atomic store for the
very first memory write in step 3 that the right thing will happen.
On the Go side, that will ensure that all memory operations before
that memory write have at least been executed.  And if the reader does
an atomic read, it should ensure that when the reader sees that memory
write, it will also see all the earlier memory writes.


>
> If I were in C-land I believe would use something like:
>
> #include 
>
> void wmb(void);
>
>
> The question is - what is the go equivalent.

There is no Go equivalent to a pure write memory barrier.  Of course
you could use cgo to call a C function that does exactly what you
want.
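
Purely to illustrate that cgo escape hatch (hypothetical names; the fence is
the standard C11 atomic_thread_fence, and the cgo call itself also acts as a
compiler-level barrier on the Go side):

package barrier

/*
#include <stdatomic.h>
static void write_barrier(void) {
	atomic_thread_fence(memory_order_release); // C11 release fence
}
*/
import "C"

// WriteBarrier asks the C side to order all earlier stores before any later
// stores, as seen by CPUs honoring release/acquire semantics.
func WriteBarrier() {
	C.write_barrier()
}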

Ian



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread 'David Klempner' via golang-nuts
On Mon, Jan 23, 2023 at 11:38 AM Robert Engels 
wrote:

> The atomic functions force a memory barrier when used in conjunction
> with the atomic read of the same value.
>
> You could use CGO to call a C function to do what you desire - but it
> shouldn’t be necessary.
>

C doesn't solve this problem, unless your "C function" is something like an
"asm volatile". C has the exact same underlying issue as Go here -- the
language does not define interprocess atomics, because doing so is hard.

In practice you can probably do the equivalent of a sequentially consistent
store on the writer side and a sequentially consistent read on the other,
and things will likely work.

With that said, that doesn't strictly work even for two C programs. As a
concrete example, if you're implementing sequentially consistent atomics on
x86, while read-modify-write instructions don't need any extra fencing, you
have to choose whether to make reads expensive (by adding a fence on reads)
or writes expensive (by adding a fence on writes). While in practice every
implementation chooses to make the writes expensive, that is an
implementation decision and nothing stops someone from making a
reads-expensive-writes-cheap compiler.

If a "writes cheap compiler" program does an atomic write over shared
memory to a "reads cheap compiler" program, there won't be any fences on
either side and you won't get sequential consistent synchronization.
(That's fine for a simple producer-consumer queue as in this example which
only requires release-acquire semantics, but fancier algorithms that rely
on sequential consistency will be quite unhappy.)


>
> Not sure what else I can tell you.
>
> On Jan 22, 2023, at 8:12 PM, Peter Rabbitson  wrote:
>
> 
> On Mon, Jan 23, 2023 at 12:42 AM robert engels 
> wrote:
>
>> Write data to memory mapped file/shared memory. Keep track of last
>> written byte as new_length;
>>
>> Use atomic.StoreUint64(pointer to header.length, new_length);
>>
>>
> This does not answer the question I posed, which boils down to:
>
> How does one insert the equivalent of smp_wmb() /
> asm volatile("" ::: "memory") into a go program.
>
> For instance is any of these an answer?
> https://groups.google.com/g/golang-nuts/c/tnr0T_7tyDk/m/9T2BOvCkAQAJ
>
>
>> readers read ...
>>
>
> Please don't focus on the reader ;)
>
>
>> This assumes you are always appending ... then it is much more
>> complicated ... all readers have consumed the data before the writer reuses
>> it.
>>
>
> Yes, it is much more complicated :) I am making a note to post the result
> back to this thread in a few weeks when it is readable enough.
>
>



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Robert Engels
The atomic functions force a memory barrier when used in conjunction with
the atomic read of the same value.

You could use CGO to call a C function to do what you desire - but it shouldn’t 
be necessary. 

Not sure what else I can tell you. 

> On Jan 22, 2023, at 8:12 PM, Peter Rabbitson  wrote:
> 
> 
>> On Mon, Jan 23, 2023 at 12:42 AM robert engels  wrote:
> 
>> Write data to memory mapped file/shared memory. Keep track of last written 
>> byte as new_length;
>> 
>> Use atomic.StoreUint64(pointer to header.length, new_length);
>> 
> 
> This does not answer the question I posed, which boils down to:
> 
> How does one insert the equivalent of smp_wmb() / asm volatile("" ::: 
> "memory") into a go program.
> 
> For instance is any of these an answer? 
> https://groups.google.com/g/golang-nuts/c/tnr0T_7tyDk/m/9T2BOvCkAQAJ
>  
>> readers read ...
> 
> Please don't focus on the reader ;) 
>  
>> This assumes you are always appending ... then it is much more complicated
>> ... all readers have consumed the data before the writer reuses it.
> 
> Yes, it is much more complicated :) I am making a note to post the result 
> back to this thread in a few weeks when it is readable enough.



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Peter Rabbitson
On Mon, Jan 23, 2023 at 12:42 AM robert engels 
wrote:

> Write data to memory mapped file/shared memory. Keep track of last written
> byte as new_length;
>
> Use atomic.StoreUint64(pointer to header.length, new_length);
>
>
This does not answer the question I posed, which boils down to:

How does one insert the equivalent of smp_wmb() /
asm volatile("" ::: "memory") into a go program.

For instance is any of these an answer?
https://groups.google.com/g/golang-nuts/c/tnr0T_7tyDk/m/9T2BOvCkAQAJ


> readers read ...
>

Please don't focus on the reader ;)


> This assumes you are always appending ... then it is much more complicated
> ... all readers have consumed the data before the writer reuses it.
>

Yes, it is much more complicated :) I am making a note to post the result
back to this thread in a few weeks when it is readable enough.



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread robert engels
Write data to memory mapped file/shared memory. Keep track of last written byte 
as new_length;

Use atomic.StoreUint64(pointer to header.length, new_length);

readers read header.length atomically to determine the last valid byte (using 
whatever facilities their language has).

A reader then knows that bytes up to header.length are valid to consume.

This assumes you are always appending to the buffer - never reusing the earlier 
buffer space. If you desire to do this, then it is much more complicated as you 
need to determine that all readers have consumed the data before the writer 
reuses it.

The above must work in order for Go to have a happens-before relationship with
the atomics - all writes must be visible to a reader that sees the updated value
in the header.
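
A rough sketch of that scheme (field and function names here are made up for
illustration; it relies on there being a single writer):

package appendlog

import (
	"sync/atomic"
	"unsafe"
)

const headerSize = 8 // uint64 length field at offset 0 of the mapping

// Append copies data after the current end using plain writes, then publishes
// the new length with one atomic store, so a reader that observes the new
// length also observes every byte it covers.
func Append(buf, data []byte) {
	lenPtr := (*uint64)(unsafe.Pointer(&buf[0]))
	end := atomic.LoadUint64(lenPtr) // only this single writer ever changes it
	if end < headerSize {
		end = headerSize // first append lands right after the header
	}
	copy(buf[end:], data)
	atomic.StoreUint64(lenPtr, end+uint64(len(data)))
}

// Valid returns the prefix a reader may safely consume right now.
func Valid(buf []byte) []byte {
	end := atomic.LoadUint64((*uint64)(unsafe.Pointer(&buf[0])))
	if end < headerSize {
		return nil
	}
	return buf[headerSize:end]
}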


> On Jan 22, 2023, at 12:53 PM, Peter Rabbitson  wrote:
> 
> 
> 
> On Sun, Jan 22, 2023 at 7:39 PM robert engels  wrote:
> The atomic store will force a memory barrier - as long as the reader (in the 
> other process) atomically reads the “new value”, all other writes prior will 
> also be visible.
> 
> Could you translate this to specific go code? What would constitute what you 
> called "the atomic store" in the playground example I gave?  
>  
> BUT you can still have an inter-process race condition if you are updating 
> the same memory mapped file regions - and you need an OS mutex to protect 
> against this
> 
> Correct. This particular system is multiple-reader single-threaded-writer, 
> enforced by a Fcntl POSIX advisory lock. Therefore as long as I make the 
> specific writer consistent - I am done.
>  
> You can look at projects like https://github.com/OpenHFT/Chronicle-Queue for ideas.
> 
> Still, large-scale shared memory systems are usually not required. I would 
> use a highly efficient message system like Nats.io  and not 
> reinvent the wheel. Messaging systems are also far more flexible.
> 
> 
> Nod, the example you linked is vaguely in line with what I want. You are also 
> correct that reinventing a wheel is bad form, and is to be avoided at all 
> costs. Yet the latency sensitivity of the particular IPC unfortunately does 
> call for an even rounder wheel. My problem isn't about "what to do" nor "is 
> there another way", but rather "how do I do this from within the confines of 
> go". 
>  
> 



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Peter Rabbitson
On Sun, Jan 22, 2023 at 7:39 PM robert engels  wrote:

> The atomic store will force a memory barrier - as long as the reader (in
> the other process) atomically reads the “new value”, all other writes prior
> will also be visible.
>

Could you translate this to specific go code? What would constitute what
you called "the atomic store" in the playground example I gave?


> BUT you can still have an inter-process race condition if you are updating
> the same memory mapped file regions - and you need an OS mutex to protect
> against this
>

Correct. This particular system is multiple-reader single-threaded-writer,
enforced by a Fcntl POSIX advisory lock. Therefore as long as I make the
specific writer consistent - I am done.


> You can look at projects like https://github.com/OpenHFT/Chronicle-Queue for
> ideas.
>
> Still, large-scale shared memory systems are usually not required. I would
> use a highly efficient message system like Nats.io and not reinvent the
> wheel. Messaging systems are also far more flexible.
>
>
Nod, the example you linked is vaguely in line with what I want. You are
also correct that reinventing a wheel is bad form, and is to be avoided at
all costs. Yet the latency sensitivity of the particular IPC unfortunately
does call for an even rounder wheel. My problem isn't about "what to do"
nor "is there another way", but rather "how do I do this from within the
confines of go".



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread robert engels
The atomic store will force a memory barrier - as long as the reader (in the 
other process) atomically reads the “new value”, all other writes prior will 
also be visible.

BUT you can still have an inter-process race condition if you are updating the 
same memory mapped file regions - and you need an OS mutex to protect against 
this, or use other advanced append/sequence number techniques.

You can look at projects like https://github.com/OpenHFT/Chronicle-Queue for ideas.

Still, large-scale shared memory systems are usually not required. I would use
a highly efficient message system like Nats.io and not reinvent the wheel.
Messaging systems are also far more flexible.


> On Jan 22, 2023, at 11:11 AM, Peter Rabbitson  wrote:
> 
> 
> 
> On Sun, Jan 22, 2023 at 5:49 PM Ian Lance Taylor  wrote:
> On Sat, Jan 21, 2023, 11:12 PM Peter Rabbitson (ribasushi) <ribasu...@gmail.com> wrote:
> This question is focused exclusively on the writer side.
> 
> Perhaps I misunderstand, but it doesn't make sense to ask a question about 
> the memory model only about one side or the other.  The memory model is about 
> communication between two goroutines.  It has very little to say about the 
> behavior of a single goroutine.
> 
> I might be using the wrong term then, although a lot of text in 
> https://go.dev/ref/mem  is relevant, it just does not 
> answer my very specific question. Let me try from a different angle:
> 
> I want to write a single-threaded program in go which once compiled has a
> certain behavior from the point of view of the OS. More specifically this
> program, from the Linux OS point of view, should in very strict order:
> 1. grab a memory region
> 2. write an arbitrary amount of data to the region's end
> 3. write some more data to the region start
> 
> By definition 1) will happen first ( you got to grab in order to write ), but 
> it is also critical that the program does all of 2), before it starts doing 
> 3).
> 
> Modern compilers are wicked smart, and often do emit assembly that would have 
> some of 3) interleaved with 2). Moreover, due to kernel thread preemption 2) 
> and 3) could execute simultaneously on different physical CPUs, requiring a 
> NUMA-global synchronization signal to prevent this ( I believe it is LOCK on 
> x86 ). I am trying to understand how to induce all of this ordering from 
> within a go program in the most lightweight manner possible ( refer to the 
> problem statement and attempts starting on line 38 at 
> https://go.dev/play/p/ZXMg_Qq3ygF  )
> 
> 
> If I were in C-land I believe would use something like:
> 
> #include 
> void wmb(void);
> 
> The question is - what is the go equivalent.
> 



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Peter Rabbitson
On Sun, Jan 22, 2023 at 5:49 PM Ian Lance Taylor  wrote:

> On Sat, Jan 21, 2023, 11:12 PM Peter Rabbitson (ribasushi) <
> ribasu...@gmail.com> wrote:
>
>> This question is focused exclusively on the writer side.
>
>
> Perhaps I misunderstand, but it doesn't make sense to ask a question about
> the memory model only about one side or the other.  The memory model is
> about communication between two goroutines.  It has very little to say
> about the behavior of a single goroutine.
>

I might be using the wrong term then, although a lot of text in
https://go.dev/ref/mem is relevant, it just does not answer my very
specific question. Let me try from a different angle:

I want to write a single-threaded program in go which once compiled has a
certain behavior from the point of view of the OS. More specifically this
program, from the Linux OS point of view, should in very strict order:
1. grab a memory region
2. write an arbitrary amount of data to the region's end
3. write some more data to the region start

By definition 1) will happen first ( you got to grab in order to write ),
but it is also critical that the program does all of 2), before it starts
doing 3).

Modern compilers are wicked smart, and often do emit assembly that would
have some of 3) interleaved with 2). Moreover, due to kernel thread
preemption 2) and 3) could execute simultaneously on different physical
CPUs, requiring a NUMA-global synchronization signal to prevent this ( I
believe it is LOCK on x86 ). I am trying to understand how to induce all of
this ordering from within a go program in the most lightweight manner
possible ( refer to the problem statement and attempts starting on line 38
at https://go.dev/play/p/ZXMg_Qq3ygF )


If I were in C-land I believe would use something like:

#include 

void wmb(void);

The question is - what is the go equivalent.



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-22 Thread Ian Lance Taylor
On Sat, Jan 21, 2023, 11:12 PM Peter Rabbitson (ribasushi) <
ribasu...@gmail.com> wrote:

> This question is focused exclusively on the writer side.


Perhaps I misunderstand, but it doesn't make sense to ask a question about
the memory model only about one side or the other.  The memory model is
about communication between two goroutines.  It has very little to say
about the behavior of a single goroutine.

Ian


> So are you saying that this will also work (based on
> https://go.dev/play/p/ZXMg_Qq3ygF )
> mmapBufRaw[fSize-1] = 255// W1
> (*mmapBufAtomic.Load())[0] = 42  // W2
>
> How about this, would that work as a "everything before the atomic. has to
> appear as a CPU instruction 1st ?
> mmapBufRaw[fSize-1] = 255   // W1
> atomic.LoadInt64(randomVal) // any atomic access acts as barrier
> mmapBufRaw[0] = 42  // W2
>
> This is the exact mechanism I am trying to understand - what is the
> minimum that golang needs to guarantee "as written" order synchronization,
> within a specific single goroutine.
>
> On Sunday, January 22, 2023 at 2:31:37 AM UTC+1 k...@google.com wrote:
>
>> On the write side, you write your mult-GB data using normal writes, then
>> atomic.Store for the final flag uint. On the read side, you use an
>> atomic.Load for the flag uint followed by regular loads for the remaining
>> multi-GB of data.
>> Reading a particular flag value ensures that the following loads see all
>> the writes from before the writer wrote that particular flag value. This is
>> guaranteed by the memory model, as the atomic read seeing the atomic write
>> introduces the synchronized-before edge you need.
>>
>> I agree that the Go memory model doesn't directly address multi-process
>> communication like this, but assuming both ends are Go this is guaranteed
>> to work by the Go memory model. YMMV on what operations/barriers/etc. you
>> need in other languages.
>>
>> On Saturday, January 21, 2023 at 1:46:09 PM UTC-8 bse...@computer.org
>> wrote:
>>
>>> On Sat, Jan 21, 2023 at 12:11 PM Peter Rabbitson (ribasushi) <
>>> riba...@gmail.com> wrote:
>>>
 On Saturday, January 21, 2023 at 7:48:12 PM UTC+1 bse...@computer.org
 wrote:
 On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson 
 wrote:
 Greetings,

 I am trying to understand the exact mechanics of memory write ordering
 from within the same goroutine. I wrote a self-contained runnable example
 with the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and
 am copying its header here:

 // Below is a complete example, with the question starting on line 38:
 // how do I ensure that a *separate Linux OS process* observing
 `IPCfile`
 // (either via pread() or mmap()) can *NEVER* observe W2 before W1.
 // The only permissible states are:
 // 1. no changes visible
 // 2. only W1 is visible
 // 3. both W1 and W2 are visible

 This is based on my interpretation of the go memory model:

 Atomic memory operations are sequentially consistent, so here:

  (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
 (*mmapBufAtomic.Load())[0] = 42// W2

 The first atomic load happens before the second load. That also implies
 the first write (W1) happens before the second (W2). However, there is no
 guarantee that W2 will be observed by another goroutine.

 This is perfectly acceptable ( see point 2. above ). Also note that
 there is no other goroutine that is looking at this: the observers are
 separate ( possibly not even go-based ) OS processes. I am strictly trying
 to get to a point where the writer process exemplified in the playground
 will issue the CPU write instructions in the order I expect.


 I think what is really needed here is an atomic store byte operation.
 If this is the only goroutine writing to this buffer, you can emulate that
 by atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32

 This would not be viable: the W1 write is a single byte for the sake of
 brevity. In practice it will be a multi-GiB write, with a multi-KiB  write
 following it, followed by a single-UInt write. All part of a lock-free
 "ratcheted" transactional implementation, allowing for incomplete writes,
 but no dirty reads - the "root pointer" is the last thing being updated, so
 an observer process sees "old state" or "new state" and nothing inbetween.
 This is why my quest to understand the precise behavior and guarantees of
 the resulting compiled program.

>>>
>>>
>>> You realize, if W1 is a multi-GB write, another process will
>>> observe partial writes for W1. But, I believe, if another process observes
>>> W2, then it is guaranteed that all of W1 is written.
>>>
>>> I think the Go memory model does not really apply here, because you are
>>> talking about other processes reading shared memory. What you are 

Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread Peter Rabbitson (ribasushi)
This question is focused exclusively on the writer side.

So are you saying that this will also work (based 
on https://go.dev/play/p/ZXMg_Qq3ygF )
mmapBufRaw[fSize-1] = 255// W1
(*mmapBufAtomic.Load())[0] = 42  // W2

How about this, would that work as a "everything before the atomic. has to 
appear as a CPU instruction 1st ?
mmapBufRaw[fSize-1] = 255   // W1
atomic.LoadInt64(randomVal) // any atomic access acts as barrier
mmapBufRaw[0] = 42  // W2

This is the exact mechanism I am trying to understand - what is the minimum 
that golang needs to guarantee "as written" order synchronization, within a 
specific single goroutine.

On Sunday, January 22, 2023 at 2:31:37 AM UTC+1 k...@google.com wrote:

> On the write side, you write your mult-GB data using normal writes, then 
> atomic.Store for the final flag uint. On the read side, you use an 
> atomic.Load for the flag uint followed by regular loads for the remaining 
> multi-GB of data.
> Reading a particular flag value ensures that the following loads see all 
> the writes from before the writer wrote that particular flag value. This is 
> guaranteed by the memory model, as the atomic read seeing the atomic write 
> introduces the synchronized-before edge you need.
>
> I agree that the Go memory model doesn't directly address multi-process 
> communication like this, but assuming both ends are Go this is guaranteed 
> to work by the Go memory model. YMMV on what operations/barriers/etc. you 
> need in other languages.
>
> On Saturday, January 21, 2023 at 1:46:09 PM UTC-8 bse...@computer.org 
> wrote:
>
>> On Sat, Jan 21, 2023 at 12:11 PM Peter Rabbitson (ribasushi) <
>> riba...@gmail.com> wrote:
>>
>>> On Saturday, January 21, 2023 at 7:48:12 PM UTC+1 bse...@computer.org 
>>> wrote:
>>> On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson  
>>> wrote:
>>> Greetings, 
>>>
>>> I am trying to understand the exact mechanics of memory write ordering 
>>> from within the same goroutine. I wrote a self-contained runnable example 
>>> with the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and 
>>> am copying its header here:
>>>
>>> // Below is a complete example, with the question starting on line 38:
>>> // how do I ensure that a *separate Linux OS process* observing `IPCfile`
>>> // (either via pread() or mmap()) can *NEVER* observe W2 before W1.
>>> // The only permissible states are:
>>> // 1. no changes visible
>>> // 2. only W1 is visible
>>> // 3. both W1 and W2 are visible
>>>
>>> This is based on my interpretation of the go memory model:
>>>
>>> Atomic memory operations are sequentially consistent, so here:
>>>
>>>  (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
>>> (*mmapBufAtomic.Load())[0] = 42// W2
>>>
>>> The first atomic load happens before the second load. That also implies 
>>> the first write (W1) happens before the second (W2). However, there is no 
>>> guarantee that W2 will be observed by another goroutine.
>>>
>>> This is perfectly acceptable ( see point 2. above ). Also note that 
>>> there is no other goroutine that is looking at this: the observers are 
>>> separate ( possibly not even go-based ) OS processes. I am strictly trying 
>>> to get to a point where the writer process exemplified in the playground 
>>> will issue the CPU write instructions in the order I expect.
>>>  
>>>
>>> I think what is really needed here is an atomic store byte operation. If 
>>> this is the only goroutine writing to this buffer, you can emulate that by 
>>> atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32
>>>
>>> This would not be viable: the W1 write is a single byte for the sake of 
>>> brevity. In practice it will be a multi-GiB write, with a multi-KiB  write 
>>> following it, followed by a single-UInt write. All part of a lock-free 
>>> "ratcheted" transactional implementation, allowing for incomplete writes, 
>>> but no dirty reads - the "root pointer" is the last thing being updated, so 
>>> an observer process sees "old state" or "new state" and nothing inbetween. 
>>> This is why my quest to understand the precise behavior and guarantees of 
>>> the resulting compiled program.
>>>
>>
>>
>> You realize, if W1 is a multi-GB write, another process will 
>> observe partial writes for W1. But, I believe, if another process observes 
>> W2, then it is guaranteed that all of W1 is written.
>>
>> I think the Go memory model does not really apply here, because you are 
>> talking about other processes reading shared memory. What you are really 
>> relying on is that on x86, there will be a memory barrier associated with 
>> atomic loads. I don't know how this works on arm. I am not sure how 
>> portable this solution would be. The memory model is explicit about 
>> observing the effects of an atomic write operation, and sequential 
>> consistency of atomic memory operations. So it sounds like an unprotected 
>> W1 followed by an atomic store of W2 would still 

Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread 'Keith Randall' via golang-nuts
On the write side, you write your mult-GB data using normal writes, then 
atomic.Store for the final flag uint. On the read side, you use an 
atomic.Load for the flag uint followed by regular loads for the remaining 
multi-GB of data.
Reading a particular flag value ensures that the following loads see all 
the writes from before the writer wrote that particular flag value. This is 
guaranteed by the memory model, as the atomic read seeing the atomic write 
introduces the synchronized-before edge you need.

I agree that the Go memory model doesn't directly address multi-process 
communication like this, but assuming both ends are Go this is guaranteed 
to work by the Go memory model. YMMV on what operations/barriers/etc. you 
need in other languages.

On Saturday, January 21, 2023 at 1:46:09 PM UTC-8 bse...@computer.org wrote:

> On Sat, Jan 21, 2023 at 12:11 PM Peter Rabbitson (ribasushi) <
> riba...@gmail.com> wrote:
>
>> On Saturday, January 21, 2023 at 7:48:12 PM UTC+1 bse...@computer.org 
>> wrote:
>> On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson  
>> wrote:
>> Greetings, 
>>
>> I am trying to understand the exact mechanics of memory write ordering 
>> from within the same goroutine. I wrote a self-contained runnable example 
>> with the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and am 
>> copying its header here:
>>
>> // Below is a complete example, with the question starting on line 38:
>> // how do I ensure that a *separate Linux OS process* observing `IPCfile`
>> // (either via pread() or mmap()) can *NEVER* observe W2 before W1.
>> // The only permissible states are:
>> // 1. no changes visible
>> // 2. only W1 is visible
>> // 3. both W1 and W2 are visible
>>
>> This is based on my interpretation of the go memory model:
>>
>> Atomic memory operations are sequentially consistent, so here:
>>
>>  (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
>> (*mmapBufAtomic.Load())[0] = 42// W2
>>
>> The first atomic load happens before the second load. That also implies 
>> the first write (W1) happens before the second (W2). However, there is no 
>> guarantee that W2 will be observed by another goroutine.
>>
>> This is perfectly acceptable ( see point 2. above ). Also note that there 
>> is no other goroutine that is looking at this: the observers are separate ( 
>> possibly not even go-based ) OS processes. I am strictly trying to get to a 
>> point where the writer process exemplified in the playground will issue the 
>> CPU write instructions in the order I expect.
>>  
>>
>> I think what is really needed here is an atomic store byte operation. If 
>> this is the only goroutine writing to this buffer, you can emulate that by 
>> atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32
>>
>> This would not be viable: the W1 write is a single byte for the sake of 
>> brevity. In practice it will be a multi-GiB write, with a multi-KiB  write 
>> following it, followed by a single-UInt write. All part of a lock-free 
>> "ratcheted" transactional implementation, allowing for incomplete writes, 
>> but no dirty reads - the "root pointer" is the last thing being updated, so 
>> an observer process sees "old state" or "new state" and nothing inbetween. 
>> This is why my quest to understand the precise behavior and guarantees of 
>> the resulting compiled program.
>>
>
>
> You realize, if W1 is a multi-GB write, another process will 
> observe partial writes for W1. But, I believe, if another process observes 
> W2, then it is guaranteed that all of W1 is written.
>
> I think the Go memory model does not really apply here, because you are 
> talking about other processes reading shared memory. What you are really 
> relying on is that on x86, there will be a memory barrier associated with 
> atomic loads. I don't know how this works on arm. I am not sure how 
> portable this solution would be. The memory model is explicit about 
> observing the effects of an atomic write operation, and sequential 
> consistency of atomic memory operations. So it sounds like an unprotected 
> W1 followed by an atomic store of W2 would still work the same way.
>  
>
>


Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread burak serdar
On Sat, Jan 21, 2023 at 12:11 PM Peter Rabbitson (ribasushi) <
ribasu...@gmail.com> wrote:

> On Saturday, January 21, 2023 at 7:48:12 PM UTC+1 bse...@computer.org
> wrote:
> On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson 
> wrote:
> Greetings,
>
> I am trying to understand the exact mechanics of memory write ordering
> from within the same goroutine. I wrote a self-contained runnable example
> with the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and am
> copying its header here:
>
> // Below is a complete example, with the question starting on line 38:
> // how do I ensure that a *separate Linux OS process* observing `IPCfile`
> // (either via pread() or mmap()) can *NEVER* observe W2 before W1.
> // The only permissible states are:
> // 1. no changes visible
> // 2. only W1 is visible
> // 3. both W1 and W2 are visible
>
> This is based on my interpretation of the go memory model:
>
> Atomic memory operations are sequentially consistent, so here:
>
>  (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
> (*mmapBufAtomic.Load())[0] = 42// W2
>
> The first atomic load happens before the second load. That also implies
> the first write (W1) happens before the second (W2). However, there is no
> guarantee that W2 will be observed by another goroutine.
>
> This is perfectly acceptable ( see point 2. above ). Also note that there
> is no other goroutine that is looking at this: the observers are separate (
> possibly not even go-based ) OS processes. I am strictly trying to get to a
> point where the writer process exemplified in the playground will issue the
> CPU write instructions in the order I expect.
>
>
> I think what is really needed here is an atomic store byte operation. If
> this is the only goroutine writing to this buffer, you can emulate that by
> atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32
>
> This would not be viable: the W1 write is a single byte for the sake of
> brevity. In practice it will be a multi-GiB write, with a multi-KiB  write
> following it, followed by a single-UInt write. All part of a lock-free
> "ratcheted" transactional implementation, allowing for incomplete writes,
> but no dirty reads - the "root pointer" is the last thing being updated, so
> an observer process sees "old state" or "new state" and nothing inbetween.
> This is why my quest to understand the precise behavior and guarantees of
> the resulting compiled program.
>


You realize, if W1 is a multi-GB write, another process will
observe partial writes for W1. But, I believe, if another process observes
W2, then it is guaranteed that all of W1 is written.

I think the Go memory model does not really apply here, because you are
talking about other processes reading shared memory. What you are really
relying on is that on x86, there will be a memory barrier associated with
atomic loads. I don't know how this works on arm. I am not sure how
portable this solution would be. The memory model is explicit about
observing the effects of an atomic write operation, and sequential
consistency of atomic memory operations. So it sounds like an unprotected
W1 followed by an atomic store of W2 would still work the same way.


>



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread Peter Rabbitson
( apologies for the previous mangled message, re-posting from a saner UI )

On Sat, Jan 21, 2023 at 7:47 PM burak serdar  wrote:

>
>
> On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson 
> wrote:
>
>> Greetings,
>>
>> I am trying to understand the exact mechanics of memory write ordering
>> from within the same goroutine. I wrote a self-contained runnable example
>> with the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and am
>> copying its header here:
>>
>> // Below is a complete example, with the question starting on line 38:
>> // how do I ensure that a *separate Linux OS process* observing `IPCfile`
>> // (either via pread() or mmap()) can *NEVER* observe W2 before W1.
>> // The only permissible states are:
>> // 1. no changes visible
>> // 2. only W1 is visible
>> // 3. both W1 and W2 are visible
>>
>
> This is based on my interpretation of the go memory model:
>
> Atomic memory operations are sequentially consistent, so here:
>
>  (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
> (*mmapBufAtomic.Load())[0] = 42// W2
>
> The first atomic load happens before the second load. That also implies
> the first write (W1) happens before the second (W2). However, there is no
> guarantee that W2 will be observed by another goroutine.
>

This is perfectly acceptable ( see point 2. above ). Also note that there
is no other goroutine that is looking at this: the observers are separate (
possibly not even go-based ) OS processes. I am strictly trying to get to a
point where the writer process exemplified in the playground will issue the
CPU write instructions in the order I expect.


> I think what is really needed here is an atomic store byte operation. If
> this is the only goroutine writing to this buffer, you can emulate that by
> atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32
>

This would not be viable: the W1 write is a single byte for the sake of
brevity. In practice it will be a multi-GiB write, with a multi-KiB write
following it, followed by a single-UInt write. All part of a lock-free
"ratcheted" transactional implementation, allowing for incomplete writes,
but no dirty reads - the "root pointer" is the last thing being updated, so
an observer process sees "old state" or "new state" and nothing inbetween.
This is why my quest to understand the precise behavior and guarantees of
the resulting compiled program.



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread Peter Rabbitson (ribasushi)
On Saturday, January 21, 2023 at 7:48:12 PM UTC+1 bse...@computer.org wrote:
On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson  wrote:
Greetings, 

I am trying to understand the exact mechanics of memory write ordering from 
within the same goroutine. I wrote a self-contained runnable example with 
the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and am copying 
its header here:

// Below is a complete example, with the question starting on line 38:
// how do I ensure that a *separate Linux OS process* observing `IPCfile`
// (either via pread() or mmap()) can *NEVER* observe W2 before W1.
// The only permissible states are:
// 1. no changes visible
// 2. only W1 is visible
// 3. both W1 and W2 are visible

This is based on my interpretation of the go memory model:

Atomic memory operations are sequentially consistent, so here:

 (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
(*mmapBufAtomic.Load())[0] = 42// W2

The first atomic load happens before the second load. That also implies the 
first write (W1) happens before the second (W2). However, there is no 
guarantee that W2 will be observed by another goroutine.

This is perfectly acceptable ( see point 2. above ). Also note that there 
is no other goroutine that is looking at this: the observers are separate ( 
possibly not even go-based ) OS processes. I am strictly trying to get to a 
point where the writer process exemplified in the playground will issue the 
CPU write instructions in the order I expect.
 

I think what is really needed here is an atomic store byte operation. If 
this is the only goroutine writing to this buffer, you can emulate that by 
atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32

This would not be viable: the W1 write is a single byte for the sake of 
brevity. In practice it will be a multi-GiB write, with a multi-KiB  write 
following it, followed by a single-UInt write. All part of a lock-free 
"ratcheted" transactional implementation, allowing for incomplete writes, 
but no dirty reads - the "root pointer" is the last thing being updated, so 
an observer process sees "old state" or "new state" and nothing inbetween. 
This is why my quest to understand the precise behavior and guarantees of 
the resulting compiled program.



Re: [go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread burak serdar
On Sat, Jan 21, 2023 at 10:36 AM Peter Rabbitson 
wrote:

> Greetings,
>
> I am trying to understand the exact mechanics of memory write ordering
> from within the same goroutine. I wrote a self-contained runnable example
> with the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and am
> copying its header here:
>
> // Below is a complete example, with the question starting on line 38:
> // how do I ensure that a *separate Linux OS process* observing `IPCfile`
> // (either via pread() or mmap()) can *NEVER* observe W2 before W1.
> // The only permissible states are:
> // 1. no changes visible
> // 2. only W1 is visible
> // 3. both W1 and W2 are visible
>

This is based on my interpretation of the go memory model:

Atomic memory operations are sequentially consistent, so here:

 (*mmapBufAtomic.Load())[fSize-1] = 255 // W1
(*mmapBufAtomic.Load())[0] = 42// W2

The first atomic load happens before the second load. That also implies the
first write (W1) happens before the second (W2). However, there is no
guarantee that W2 will be observed by another goroutine.

I think what is really needed here is an atomic store byte operation. If
this is the only goroutine writing to this buffer, you can emulate that by
atomic.LoadUint32, set the highest/lowest byte, then atomic.StoreUint32
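
Roughly, with a hypothetical helper (safe only because a single goroutine
writes the word holding the flag byte; concurrent writers would need a
CompareAndSwap loop instead):

package flagbyte

import "sync/atomic"

// SetLowByte atomically republishes the whole 32-bit word with its lowest
// byte replaced, leaving the other three bytes untouched.
func SetLowByte(word *uint32, b byte) {
	v := atomic.LoadUint32(word) // atomic read of the containing word
	v = (v &^ 0xFF) | uint32(b)  // swap in the new low byte
	atomic.StoreUint32(word, v)  // atomic store publishes the update
}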



>
> I did read through https://go.dev/ref/mem and
> https://github.com/golang/go/discussions/47141 + links, but could not
> find a definitive answer to my specific use-case.
>
> Would really appreciate any help getting to the bottom of this!
>
>



[go-nuts] Clarification of memory model behavior within a single goroutine

2023-01-21 Thread Peter Rabbitson
Greetings,

I am trying to understand the exact mechanics of memory write ordering from
within the same goroutine. I wrote a self-contained runnable example with
the question inlined here: https://go.dev/play/p/ZXMg_Qq3ygF and am copying
its header here:

// Below is a complete example, with the question starting on line 38:
// how do I ensure that a *separate Linux OS process* observing `IPCfile`
// (either via pread() or mmap()) can *NEVER* observe W2 before W1.
// The only permissible states are:
// 1. no changes visible
// 2. only W1 is visible
// 3. both W1 and W2 are visible

I did read through https://go.dev/ref/mem and
https://github.com/golang/go/discussions/47141 + links, but could not find
a definitive answer to my specific use-case.

Would really appreciate any help getting to the bottom of this!
