Re: New garbage collector --gc:orc is a joy to use.

2020-06-27 Thread Araq
> Is that the plan?

Almost. We already need to run the cycle collector before a send to ensure the 
thread-local "cycle candidates" list is empty (note that it is always empty 
after a cycle collection), otherwise it would interfere with multi-threading. 
Alternatively, we can make the list global and protect it with a lock (or 
implement it as a lock-free queue...).

There are many other options; we can also restrict sending to `.acyclic` 
types.

Or we require that "orphaned objects", which would otherwise be misdetected as 
(false) external roots, be cleaned up manually. That is a good idea anyhow, as 
it ensures the programmer is still aware of the topology:


```nim
proc process(x: Node) =
  use(x.left)
  # likely invalid:
  spawn process(x.right)
  # better: extract it
  spawn process(move x.right)
```





Re: New garbage collector --gc:orc is a joy to use.

2020-06-26 Thread snej
> You traverse the subgraph. In doing so you count the edges (= E) and sum the 
> RC fields (= S). A graph is isolated (sendable to a different thread) if and 
> only if S = E + 1.

Clever! But what if there's an orphaned object cycle elsewhere that has a 
reference to an object in the subgraph? I've read that cycles are only cleaned 
up once in a while, so there's a time window where a dead cycle could still 
exist at the same time that I'm trying to make a cross-process call.

I suppose when you detect a non-isolated subgraph, you could first force a 
cycle collection and then retry the isolation check, to see if it was a false 
positive. Is that the plan?


Re: New garbage collector --gc:orc is a joy to use.

2020-06-26 Thread Sixte
> You traverse the subgraph.

Replace "visit" with "traverse". I had Pony's definition of `iso` in mind, but 
you describe the more general case where backpointers are already included! 
Nice! DAGs become sendable too.

> Reference counting is alias control.

Yes, refcounting is simply superior. It seems that the dynamic check is the 
most efficient way. 


Re: New garbage collector --gc:orc is a joy to use.

2020-06-26 Thread Araq
> How will the iso property be checked dynamically? Well, the object's 
> refcounts will be checked, they all have to be zero (or one, depending on how 
> we count). Now, the owned ref comes into play.

Nah. I mean, you could do it this way, but there is a much better way: You 
traverse the subgraph. In doing so you count the edges (= E) and sum the RC 
fields (= S). A graph is sendable to a different thread if and only if S = E + 
1.
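
To make the S = E + 1 rule concrete, here is a minimal sketch of the check. It is not the runtime's implementation: the `Node` type, its explicit `rc` field, and the `link` and `isIsolatedSketch` procs are made up for illustration (the real reference count lives in a hidden object header), and the external reference to the root is simulated by setting `rc: 1` by hand.

```nim
import std/sets

type
  Node = ref object
    rc: int          # simulated reference count (hypothetical field)
    kids: seq[Node]  # outgoing edges

proc link(parent, child: Node) =
  ## Adds an edge and bumps the child's simulated count.
  parent.kids.add child
  inc child.rc

proc isIsolatedSketch(root: Node): bool =
  ## True if and only if S == E + 1 for the subgraph reachable from `root`.
  var seen = initHashSet[pointer]()
  var stack = @[root]
  var edges = 0   # E: every edge leaving a visited node
  var rcSum = 0   # S: sum of the RC fields of all visited nodes
  while stack.len > 0:
    let n = stack.pop()
    # Visit each node once, even if it is reachable via several paths.
    if seen.containsOrIncl(cast[pointer](n)): continue
    rcSum += n.rc
    for k in n.kids:
      inc edges
      stack.add k
  result = rcSum == edges + 1

let a = Node(rc: 1)  # the 1 stands for the single reference that would be sent
let b = Node()
let c = Node()
link(a, b); link(a, c); link(b, c)  # a DAG: c has two parents
doAssert isIsolatedSketch(a)        # S = 4, E = 3
inc c.rc                            # simulate an outside reference to c
doAssert not isIsolatedSketch(a)    # S = 5, E = 3
```

Every edge inside the subgraph accounts for exactly one count on its target, and the reference being sent accounts for one more, so any surplus in S must come from a reference outside the subgraph.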


Re: New garbage collector --gc:orc is a joy to use.

2020-06-25 Thread Sixte
> Will the new isolated: block help with making sure I don't touch objects or 
> sub objects without locking them first?

Absolutely. An _isolated_ data structure, e.g. a list of nodes, can only be 
reached by a single reference. That reference gets passed to another thread. No 
locking involved, no atomic updates of refcounts anymore. If the `iso` gets 
deleted by the new owner, then nothing needs to be done on the side of the 
former owner. `iso` allows for moves or deletion without notifying anyone else.

How will the `iso` property be checked dynamically? Well, the object's 
refcounts will be checked, they all have to be zero (or one, depending on how 
we count). Now, the `owned ref` comes into play. The dynamic check will visit 
any node that can be reached with owned refs. What should happen with other 
refs, if they are present in the nodes? That's really a good question. 
Basically, they do not belong to the `iso` object anymore. Possibly, they could 
be set to zero (which has some implications, though). Other solutions are 
conceivable too. The simplest solution by far is a restriction to `owned ref` 
aka `iso`, and therefore the exclusion of (sharing) refs.

So, the integration of `iso` into an environment that allows for shared refs is 
tricky, especially if flexibility is wanted. 


Re: New garbage collector --gc:orc is a joy to use.

2020-06-25 Thread treeform
Yes, I "move" the ownership of objects from the main thread to the work threads. 
And I "move" the ownership back from the work thread to the main thread. Once an 
object moves away from the thread it was created on, I will not be touching 
the object or its internal sub-objects on that thread. Nim does not have any 
notion of ownership, so I have to do that manually.

I don't expect both threads reading/writing to objects without locking to 
work.

Does that make sense? Please check out the code.

Will the new isolated: block help with making sure I don't touch objects or 
sub-objects without locking them first?


Re: New garbage collector --gc:orc is a joy to use.

2020-06-25 Thread Sixte
The `arc` is definitely here to stay, but the whole story hasn't been told 
yet.

> That is correct but we'll have an isIsolated runtime check for that. There is 
> also a plan for ensuring this at compile-time via an isolated: block. Seems 
> entirely within reach thanks to Nim's effect system.

Will `isIsolated` support backpointers? A strict `iso` object contains only 
`iso` pointers ("owned" references). But then, only very limited objects can be 
built. E.g. a doubly linked list would fail (a specific problem/flaw within 
Rust). A DAG would be impossible as well. However, with backpointers (an object 
can only have one backptr, a backptr is "unique" by definition) it would be 
possible to reach any part of a list or a tree, up to the root.

A strict `iso` object doesn't require refcounting (there is nothing to 
count...).

That said, I am still musing about Pony's capabilities and their "viewpoint 
adaptation". IMHO, they have to introduce an `isv` capability, an iso-visitor, a 
capability that overcomes the limitations of `iso`. If an `iso` gets sent to 
another thread, the companion `isv` gets destroyed. This basically unlinks any 
connection of the `iso` with the sending thread. The receiving thread has to 
(re)build its own `isv`, if needed. To make this safe, the thread-ID should be 
part of the `iso`. An `isv` establishing a new link to the `iso` checks the 
"foreign" ID and sets the new thread-ID. After that, the `iso` can be used in 
any way in the new owning thread.

`iso` objects can be created, sent, and destroyed easily. In particular, this 
might be interesting for time-critical applications with extremely limited 
resources, e.g. on the microcontroller level. 


Re: New garbage collector --gc:orc is a joy to use.

2020-06-25 Thread Araq
> And furthermore, the object you're moving to another thread can't refer to 
> any other objects that have references on the current thread ... this sounds 
> like something that could accidentally lead to difficult-to-discover race 
> conditions if one is not careful!

That is correct but we'll have an `isIsolated` runtime check for that. There is 
also a plan for ensuring this at compile-time via an `isolated:` block. Seems 
entirely within reach thanks to Nim's effect system.


Re: New garbage collector --gc:orc is a joy to use.

2020-06-24 Thread snej
Cool! I'm about to embark on multithreading, now that I've gotten my async 
networking code working on a single thread. Trying to switch over to gc:orc but 
[having a few problems](https://forum.nim-lang.org/t/6485).

> Now you can just pass deeply nested ref objects between threads and it all 
> works.

Is it really that simple? Because as @araq has stated, ARC's retain/release are 
_not_ atomic. That implies to me that a `ref` object can never be used 
concurrently on multiple threads.

So I think by "pass" you mean "move" — the way you've described your code, it 
sounds like the work queues need to use move semantics, so the "push" operation 
takes an object as a `sink` parameter. Is that accurate?
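
For illustration, a work queue whose push takes ownership through a `sink` parameter could look roughly like the sketch below. This is not treeform's code: the `Job` and `WorkQueue` types and the `initWorkQueue`, `push`, and `tryPop` procs are hypothetical names, and it assumes compilation with `--threads:on` so that `std/locks` is available.

```nim
import std/locks

type
  Job = ref object
    payload: seq[int]
  WorkQueue = object
    lock: Lock
    items: seq[Job]

proc initWorkQueue(q: var WorkQueue) =
  initLock(q.lock)

proc push(q: var WorkQueue; job: sink Job) =
  ## Takes ownership of `job`; the lock is held only for the append.
  withLock q.lock:
    q.items.add job

proc tryPop(q: var WorkQueue; job: var Job): bool =
  ## Hands out the oldest job, or returns false if the queue is empty.
  withLock q.lock:
    if q.items.len > 0:
      job = q.items[0]
      q.items.delete(0)
      result = true

var q: WorkQueue
initWorkQueue(q)
var job = Job(payload: @[1, 2, 3])
q.push(move job)      # explicit move: the sender gives up its reference
doAssert job.isNil    # `move` resets the source variable
```

The explicit `move` at the call site leaves the sender's variable empty, so the sending thread has no handle left on the object. It does not protect against other aliases the sender may still hold elsewhere.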


Re: New garbage collector --gc:orc is a joy to use.

2020-06-24 Thread treeform
There is this:

  * [https://nim-lang.org/docs/gc.html](https://nim-lang.org/docs/gc.html)
  * [https://www.youtube.com/watch?v=aUJcYTnPWCg](https://www.youtube.com/watch?v=aUJcYTnPWCg)
  * [https://www.youtube.com/watch?v=yA32Wxl59wo](https://www.youtube.com/watch?v=yA32Wxl59wo)
  * [https://nim-lang.org/araq/destructors.html](https://nim-lang.org/araq/destructors.html)
  * [https://nim-lang.org/araq/ownedrefs.html](https://nim-lang.org/araq/ownedrefs.html)




Re: New garbage collector --gc:orc is a joy to use.

2020-06-24 Thread jxy
Are there detailed documents about the gc options in Nim, specifically `arc` 
and `orc`?


New garbage collector --gc:orc is a joy to use.

2020-06-24 Thread treeform
Nim has a new garbage collector called Orc (enabled with --gc:orc). It’s a 
reference counting mechanism with cycle detection. The most important feature of 
--gc:orc is much better support for threads by sharing the heap between them.

Now you can just pass deeply nested ref objects between threads and it all 
works. My threading needs are pretty pedestrian. I basically have a work queue 
with several work threads and I need work done. I need to pass large nested 
objects to the workers and the workers produce large nested data back. The old 
way to do that is with channels, but channels copy their data. Copying data can 
actually be better and faster with “share nothing” concurrency. But it’s really 
bad for my use case of passing around large nested structures. Another way was 
to use pointers but then I was basically writing C with manual allocations and 
deallocations, not Nim! This is why the new --gc:orc works so much better for me.

You still need to use and understand locks. But it’s not that bad. I just use 
two locks, one for the input queue and one for the output queue. The threads 
hold a lock (between acquire and release) for as short a time as possible. No 
thread holds more than 1 lock at a time.

See my threaded work example here: 
[https://gist.github.com/treeform/3e8c3be53b2999d709dadc2bc2b4e097](https://gist.github.com/treeform/3e8c3be53b2999d709dadc2bc2b4e097)
 (Feedback on how to make it better welcome.)

Before, creating objects and passing them between threads was a big issue. 
The default garbage collector (--gc:refc) gives each thread its own heap. With 
the old model, objects allocated on one thread had to be deallocated on the same 
thread. This restriction is gone now!

Another big difference is that it’s more deterministic and supports 
destructors. The compiler can also infer where the frees will happen and 
optimize many allocations and deallocations with move semantics (similar to 
Rust). Sadly, it can’t optimize all of them away, which is why reference 
counting exists. Also, the cycle detector will try to find garbage cycles and 
free them as well.

This means I do not have to change the way I write code. I don’t have to mark 
my code in any special way and I don’t really have to worry about cycles. The 
new Orc GC is simply better.

This makes the new garbage collector --gc:orc a joy to use.

(If there are any factual errors about the GC let me know.)