Re: Synchronisation help

2024-01-03 Thread Anonymouse via Digitalmars-d-learn
On Tuesday, 2 January 2024 at 18:01:55 UTC, Jonathan M Davis 
wrote:

[...]


That clarifies a lot. If nothing else, I realise now I can't have 
`opBinaryRight(string op : "in")` the way I had hoped.


What I have now probably doesn't cover 100% of every use-case, 
but it should do for my scope. I'm much more confident using this 
compared to the naked `shared` AA I started with.


Thanks everyone for the help.


Re: Synchronisation help

2024-01-02 Thread user1234 via Digitalmars-d-learn

On Tuesday, 2 January 2024 at 11:39:12 UTC, Anonymouse wrote:

On Tuesday, 2 January 2024 at 11:05:33 UTC, user1234 wrote:
Do not use `shared` AA. Use `__gshared` + sync primitives. 
`shared` AA will lead to all sort of bugs:


- https://issues.dlang.org/show_bug.cgi?id=20484#c1
- https://issues.dlang.org/show_bug.cgi?id=17088
- https://issues.dlang.org/show_bug.cgi?id=16597
- etc.


Hmm, I see.

Is `shared` safe to use with AAs *provided* I use sync 
primitives, or should I favour `__gshared` over `shared`? I was 
under the impression `__gshared` was only really meant for 
interfacing with C.


The semantics of __gshared in this context is "no-TLS". You meant 
to access the AA in several threads right ? So you dont want each 
thread to have its own copy.


Re: Synchronisation help

2024-01-02 Thread Jonathan M Davis via Digitalmars-d-learn
On Tuesday, January 2, 2024 3:41:55 AM MST Anonymouse via Digitalmars-d-learn 
wrote:
> On Monday, 1 January 2024 at 19:49:28 UTC, Jonathan M Davis wrote:
> > [...]
>
> Thank you. Yes, `Foo` is a class for the purposes of inheritance
> -- I left that out of the example.
>
> So a completely valid solution is to write a struct wrapper
> around an AA of the type I need, overload the required operators,
> and then just drop-in replace the current AA? All array
> operations would then transparently be between lock and unlock
> statements.
> ...
> I tried this and it seems to work. Is it glaringly incorrect
> somehow, or am I free to roll with this?

For a member function to work on a shared object, that member function must
be marked as shared. For most types, of course, it's completely
inappropriate to mark a member function as shared, since those types do
nothing with thread synchronization, but for types which are specifically
designed to work across threads, it's appropriate to mark the member
functions as shared and then to have them internally do whatever they need
to do to protect access across threads. So, if you want to create a type
which wraps an AA to make it thread-safe, it would be appropriate to make
all of its member functions shared and then deal with the locking primitives
internally.

That being said, as Christian pointed out, whether locking at the operation
level is the best way to deal with thread synchronization depends on what
you're doing. For instance, doing something like

if (auto value = key in aa)
{
// do stuff with *value
}

would be seriously problematic if you simply wrapped the AA, because you'd
then have a pointer to a value in the AA which might not even be in the AA
anymore when you try to do something with it (since another thread could
have happily mutated the AA via another function call). In addition, it's
problematic if in returns a thread-local pointer, since it's referencing
shared data. So, it's a bug for it to be returning a thread-local pointer.
It either needs to be returning a shared pointer (meaning that the locking
primitives need to be at a higher level), or you need to copy the data out
rather than return a pointer (as well as ensuring that the data itself is
fully thread-local, which could be problematic with reference types).

For something like this, locking the mutex for a whole set of operations
makes more sense, and if you're doing that, you probably don't want a struct
or class which simply wraps the AA. Rather, you'd want to have whatever code
was operating on the AA to be handling the locking - which would often be
inside of a struct or class that has the AA as a shared member variable. So,
all of the code that uses the AA would be encapsulated, but you wouldn't
have created a type that's simply wrapping the AA.

What you'd typically do would probably be one of two approaches:

1. Create a type which handles all of the threading stuff internally
(including spawning the other thread) and which provides an API to the main
thread which is thread-local in the sense that the main thread doesn't have
to know or care about the threading that's being done internally.

2. Create a type which is passed from one thread to another and then
designed to be used across threads in a thread-safe manner where any
operation that you can do on that type is marked as shared and designed to
be thread-safe, which would typically mean having the operations being high
enough level that the caller doesn't have to worry about the synchronization
primitives at all, though of course, depending on what you're doing, you
might have to expose more. Either way, the idea would be to make it so that
that shared object is handling as much of the threading synchronization
stuff as possible, and when it doesn't, it provides the means for the code
using it to use the thread synchronization mechanisms on the data in as
simple and safe a manner as possible.

You could of course have a much messier approach where nothing is really
wrapped, but that makes it much harder to manage the code, whereas it will
usually work much better if you can encapsulate the stuff that has to deal
with shared. But you do have to pick an encapsulation level which allows you
to actually protect the data, which isn't likely to be the case at the level
of the AA itself but rather at a higher level which is using the AA to do
stuff.

Ultimately, what you do with sharing data across threads in D is essentially
what you'd do in a language like C++. It's just that D's shared makes it
clear to the type system what's being shared across threads and what's
thread-local so that the type system can assume that most stuff is
thread-local as well as prevent you from accessing shared data accidentally
(whereas in C++, you only know what's thread-local and what isn't by
convention, since normally, _everything_ is shared across threads but only a
small portion of it is actusally used by multiple threads). So, we're able
to 

Re: Synchronisation help

2024-01-02 Thread Jonathan M Davis via Digitalmars-d-learn
On Tuesday, January 2, 2024 4:39:12 AM MST Anonymouse via Digitalmars-d-learn 
wrote:
> On Tuesday, 2 January 2024 at 11:05:33 UTC, user1234 wrote:
> > Do not use `shared` AA. Use `__gshared` + sync primitives.
> > `shared` AA will lead to all sort of bugs:
> >
> > - https://issues.dlang.org/show_bug.cgi?id=20484#c1
> > - https://issues.dlang.org/show_bug.cgi?id=17088
> > - https://issues.dlang.org/show_bug.cgi?id=16597
> > - etc.
>
> Hmm, I see.
>
> Is `shared` safe to use with AAs *provided* I use sync
> primitives, or should I favour `__gshared` over `shared`? I was
> under the impression `__gshared` was only really meant for
> interfacing with C.

You should almost never use __gshared. It's really only intended to be used
with C global variables.

Some folks use __gshared, because they don't like the restrictions that
shared places on your code, but the restrictions are there to protect you
from accessing shared data when it's not properly protected. In general,
what code should be doing is marking variables as shared so that you cannot
accidentally access the data, and then in the sections of code where you've
properly protected access to the data, you temporarily cast away shared to
operate on it. This is obviously a tad annoying, which is why some folks
then just use __gshared to shut up the compiler, but it's very much on
purpose that things work this way, and if you mark a variable as __gshared,
the type system treats it as thread-local, and it's never caught when you
try to access the variable without first dealing with the proper
synchronization primitives.

Unless a type is specifically designed to work as shared (e.g. a class or
struct with shared member functions which do all of the appropriate locking
and casting internally), it's expected that you're going to have to either
cast away shared or use atomics to operate on shared variables of that type.
And AAs are designed to be thread-local, so they have no locking mechanisms
built in, and you have to deal with the locking primitives yourself as well
as casting away shared to then operate on the AA while it's protected. It's
a bug when you can do pretty much anything with a shared AA other than pass
it around without casting away shared first.

- Jonathan M Davis





Re: Synchronisation help

2024-01-02 Thread Christian Köstlin via Digitalmars-d-learn

On Tuesday, 2 January 2024 at 10:41:55 UTC, Anonymouse wrote:
On Monday, 1 January 2024 at 19:49:28 UTC, Jonathan M Davis 
wrote:

[...]


Thank you. Yes, `Foo` is a class for the purposes of 
inheritance -- I left that out of the example.


So a completely valid solution is to write a struct wrapper 
around an AA of the type I need, overload the required 
operators, and then just drop-in replace the current AA? All 
array operations would then transparently be between lock and 
unlock statements.


[...]

I guess this can break synchronisation between the two if I 
replace the `Mutex` in either thread. Are there any other 
obvious caveats?


It might be, that this locking scheme is too narrow. E.g. you 
might want to have an "atomic" testAndSet on the AA e.g. check if 
an element is in and iff its not in put it there.


Kind regards,
Christian


Re: Synchronisation help

2024-01-02 Thread Anonymouse via Digitalmars-d-learn

On Tuesday, 2 January 2024 at 11:05:33 UTC, user1234 wrote:
Do not use `shared` AA. Use `__gshared` + sync primitives. 
`shared` AA will lead to all sort of bugs:


- https://issues.dlang.org/show_bug.cgi?id=20484#c1
- https://issues.dlang.org/show_bug.cgi?id=17088
- https://issues.dlang.org/show_bug.cgi?id=16597
- etc.


Hmm, I see.

Is `shared` safe to use with AAs *provided* I use sync 
primitives, or should I favour `__gshared` over `shared`? I was 
under the impression `__gshared` was only really meant for 
interfacing with C.


Re: Synchronisation help

2024-01-02 Thread user1234 via Digitalmars-d-learn

On Monday, 1 January 2024 at 15:48:16 UTC, Anonymouse wrote:
I have a `shared string[int]` AA that I access from two 
different threads. The function I spawn to start the second 
thread takes the AA as an argument.


[...]

What is the common solution here? Do I add a module-level 
`Object thing` and move everything accessing the AA into 
`synchronized(.thing)` statements? Or maybe add a `shared 
static` something to `Foo` and synchronise with 
`synchronize(Foo.thing)`?


Do not use `shared` AA. Use `__gshared` + sync primitives. 
`shared` AA will lead to all sort of bugs:


- https://issues.dlang.org/show_bug.cgi?id=20484#c1
- https://issues.dlang.org/show_bug.cgi?id=17088
- https://issues.dlang.org/show_bug.cgi?id=16597
- etc.


Re: Synchronisation help

2024-01-02 Thread Anonymouse via Digitalmars-d-learn

On Monday, 1 January 2024 at 19:49:28 UTC, Jonathan M Davis wrote:

[...]


Thank you. Yes, `Foo` is a class for the purposes of inheritance 
-- I left that out of the example.


So a completely valid solution is to write a struct wrapper 
around an AA of the type I need, overload the required operators, 
and then just drop-in replace the current AA? All array 
operations would then transparently be between lock and unlock 
statements.


```d
struct MutexedAA(AA : V[K], V, K)
{
import core.sync.mutex : Mutex;

shared Mutex mutex;

shared AA aa;

void setup() nothrow
{
mutex = new shared Mutex;

mutex.lock_nothrow();

if (K.init !in (cast()aa))
{
(cast()aa)[K.init] = V.init;
(cast()aa).remove(K.init);
}

mutex.unlock_nothrow();
}

auto opIndexAssign(V value, K key)
in (mutex, typeof(this).stringof ~ " has null Mutex")
{
mutex.lock_nothrow();
(cast()aa)[key] = value;
mutex.unlock_nothrow();
return value;
}

auto opIndex(K key)
in (mutex, typeof(this).stringof ~ " has null Mutex")
{
mutex.lock_nothrow();
auto value = (cast()aa)[key];
mutex.unlock_nothrow();
return value;
}

auto opBinaryRight(string op : "in")(K key)
in (mutex, typeof(this).stringof ~ " has null Mutex")
{
mutex.lock_nothrow();
auto value = key in cast()aa;
mutex.unlock_nothrow();
return value;
}

auto remove(K key)
in (mutex, typeof(this).stringof ~ " has null Mutex")
{
mutex.lock_nothrow();
auto value = (cast()aa).remove(key);
mutex.unlock_nothrow();
return value;
}

auto opEquals()(auto ref typeof(this) other)
in (mutex, typeof(this).stringof ~ " has null Mutex")
{
mutex.lock_nothrow();
auto isEqual = (cast()aa == cast()(other.aa));
mutex.unlock_nothrow();
return isEqual;
}

auto opEquals()(auto ref AA other)
in (mutex, typeof(this).stringof ~ " has null Mutex")
{
mutex.lock_nothrow();
auto isEqual = (cast()aa == other);
mutex.unlock_nothrow();
return isEqual;
}
}
```

(https://gist.github.com/zorael/433c50f238b21b9bb68d076d8a495045)

I tried this and it seems to work. Is it glaringly incorrect 
somehow, or am I free to roll with this?


You mention passing a `shared Foo*`. In the gist I pass the 
instance of the `MutexedAA!(string[int])` to the worker thread 
*by value* instead of as something `shared`, since I couldn't get 
operator overloading to work when `shared`. (Calling 
`sharedAA.opIndexAssign("hello", 42)` worked, but `sharedAA[42] = 
"hello"` wouldn't compile.)


I guess this can break synchronisation between the two if I 
replace the `Mutex` in either thread. Are there any other obvious 
caveats?


Re: Synchronisation help

2024-01-01 Thread Steven Schveighoffer via Digitalmars-d-learn

On Monday, 1 January 2024 at 15:48:16 UTC, Anonymouse wrote:
What is the common solution here? Do I add a module-level 
`Object thing` and move everything accessing the AA into 
`synchronized(.thing)` statements? Or maybe add a `shared 
static` something to `Foo` and synchronise with 
`synchronize(Foo.thing)`?


Yeah, and the thing should be a `Mutex` object. A `Mutex` object 
uses it's underlying mutex primitive as its monitor. This also 
gives options for usage with methods as well as `synchronized` 
statements.


Just make sure you mark it `__gshared` or `shared` so all threads 
see it.


-Steve


Re: Synchronisation help

2024-01-01 Thread Jonathan M Davis via Digitalmars-d-learn
On Monday, January 1, 2024 8:48:16 AM MST Anonymouse via Digitalmars-d-learn 
wrote:
> I have a `shared string[int]` AA that I access from two different
> threads. The function I spawn to start the second thread takes
> the AA as an argument.
>
> ```d
> class Foo
> {
>  shared string[int] bucket;
>  Tid worker;
> }
>
> void workerFn(shared string[int] bucket)
> {
>  while (true)
>  {
>  // occasionally reads, occasionally modifies bucket
>  }
> }
>
> void main()
> {
>  auto foo = new Foo;
>  foo.bucket[0] = string.init;
>  foo.bucket.remove(0);
>  foo.worker = spawn(, foo.bucket);
>
>  while (true)
>  {
>  // occasionally reads, occasionally modifies bucket
>  }
> }
> ```
>
> (`run.dlang.io` shortening seems broken again, but I made a
> [gist](https://gist.github.com/zorael/17b042c424cfea5ebb5f1f3120f983f4) of a
> more complete example.)
>
> Reading the specs on `synchronized` statements, it seems I need
> to provide an `Object` to base synchronisation on when two
> *different* places in the code needs synchronising, whereas if
> it's in the same place an expressionless `synchronize { }` will
> do.
>
> The worker function can't see `Foo foo` inside `main`, so it
> can't share synchronisation on that.
>
> What is the common solution here? Do I add a module-level `Object
> thing` and move everything accessing the AA into
> `synchronized(.thing)` statements? Or maybe add a `shared static`
> something to `Foo` and synchronise with `synchronize(Foo.thing)`?

In general, I would advise against using synchronized statements. They
really don't add anything, particularly since in many cases, you need access
to more complex thread-synchronization facilities anyway (e.g. condition
variables). Really, synchronized statements are just a Java-ism that D got
fairly early on that were arguably a mistake. So, I'd typically suggest that
folks just use Mutex from core.sync.mutex directly (though you can certainly
use them if you don't need to do anything more complex).

https://dlang.org/phobos/core_sync_mutex.html

If you're using synchronized statements, you're essentially just using
syntax which does that underneath the hood without providing you the
functionality to use stuff like Condition from core.sync.condition.

https://dlang.org/phobos/core_sync_condition.html

However, regardless of whether you use synchronized or use Mutex directly,
what you need to do is to have an object that functions as a mutex to
protect the shared data, locking it whenever you access it so that only one
thread can access it at a time.

The best place to put that mutex varies depending on what your code is
doing. A shared static variable could make sense, but it's often the case
that you would put the mutex inside the class or struct that contains the
data that's shared across threads. Or if you don't have a type that's
intended to encompass what you're doing with the shared data, then it often
makes sense to create one to hold the shared data so that the code that's
using it doesn't have to deal with the synchronization mechanisms but rather
all of that mess is contained entirely within the class or struct that
you're passing around. But even if you don't want to encapsulate it all
within a struct or class, simply creating one to hold both the shared data
and the mutex makes it so that they'll be together wherever you're passing
them around, making it easy for the code using the AA to access the mutex.

However, because you're not supposed to actually be mutating data while it's
shared (and the compiler largely prevents you from doing so), what you
generally need to do to operate on shared data is to lock the mutex that
protects it, cast away shared so that you can operate on the data, do
whatever it is that you need to do with the now thread-local data, make sure
that no thread-local references to the data exist any longer, and then lock
the mutex again. And to do that cleanly, it's often nice to create a struct
or class with shared member functions which takes care of all of that for
you so that that particular dance is encapsulated rather than having to deal
with any code that has access to that shared data having to deal with the
synchronization correctly.

Given that you already have a class called Foo which contains the AA, I
would say that the most obvious thing to do would be to just pass a shared
Foo across threads rather than pass the AA from inside Foo. Then you can
either put a mutex in Foo that then naturally gets passed along with the AA,
or you just use the class itself as a mutex - e.g. synchronized(this) {}
IIRC - since classes unfortunately have a mutex built into them to make
synchronized member functions work (which is useful when you want to use
synchronized functions but causes unnecessary bloat for most classes). And
if it makes sense to lock the mutex during entire function calls, you can
just make the member function synchronized rather than having 

Synchronisation help

2024-01-01 Thread Anonymouse via Digitalmars-d-learn
I have a `shared string[int]` AA that I access from two different 
threads. The function I spawn to start the second thread takes 
the AA as an argument.


```d
class Foo
{
shared string[int] bucket;
Tid worker;
}

void workerFn(shared string[int] bucket)
{
while (true)
{
// occasionally reads, occasionally modifies bucket
}
}

void main()
{
auto foo = new Foo;
foo.bucket[0] = string.init;
foo.bucket.remove(0);
foo.worker = spawn(, foo.bucket);

while (true)
{
// occasionally reads, occasionally modifies bucket
}
}
```

(`run.dlang.io` shortening seems broken again, but I made a 
[gist](https://gist.github.com/zorael/17b042c424cfea5ebb5f1f3120f983f4) of a more complete example.)


Reading the specs on `synchronized` statements, it seems I need 
to provide an `Object` to base synchronisation on when two 
*different* places in the code needs synchronising, whereas if 
it's in the same place an expressionless `synchronize { }` will 
do.


The worker function can't see `Foo foo` inside `main`, so it 
can't share synchronisation on that.


What is the common solution here? Do I add a module-level `Object 
thing` and move everything accessing the AA into 
`synchronized(.thing)` statements? Or maybe add a `shared static` 
something to `Foo` and synchronise with `synchronize(Foo.thing)`?