> all allocations are shared by default.
@Araq, the above means that all refs are shared, which means all unowned
"dangling" refs are shared, so unowned refs can be passed to several
other threads as long as the calling thread's lifetime is longer than that
of the threads to which the unowned refs are passed. In turn, this means
that refcount accesses must be atomic when `--threads:on`, as they are now
multi-threaded. This has a pretty big impact on performance for shared
unowned refs.
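The cost in question is the difference between a plain `inc` and an `atomicInc` on the refcount field: every copy of a shared unowned ref becomes a locked read-modify-write. A minimal sketch (my own illustration, not the runtime's code; compile with `--threads:on`) of why the atomic bump is needed once two threads touch the same counter:

```nim
# Two threads bump one shared counter the way shared unowned refs
# would bump a shared refcount. `atomicInc` is a locked RMW, so no
# increments are lost; a plain `inc` here could silently drop updates.
var counter: int

proc worker() {.thread.} =
  for _ in 1 .. 100_000:
    atomicInc counter

var threads: array[2, Thread[void]]
for t in threads.mitems:
  createThread(t, worker)
joinThreads threads
assert counter == 200_000  # exact only because the bumps are atomic
```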
I believe that passing unowned refs is a valid use case where the
controlling thread, which will often be the main thread, needs to divide up
work while retaining ownership of a container that might contain things
such as strings and/or seqs. For the strings/seqs themselves, atomic
accessors are an option that can be enabled or bypassed as the requirements
of inter-thread use dictate, but ref counting is part of the definition of
refs.
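To make the use case concrete, here is a minimal sketch (my own example, assuming `--threads:on`) of a main thread that divides work over raw pointers into a seq it owns, joining the workers before the seq is destroyed, so no refcount traffic is needed at all:

```nim
# The main thread owns `data` and outlives the workers, so handing
# out unowned (raw) pointers into it is safe without refcounting.
var data = newSeq[int](4)

proc worker(slot: ptr int) {.thread.} =
  slot[] = 42  # writes into memory the main thread still owns

var threads: array[4, Thread[ptr int]]
for i in 0 ..< 4:
  createThread(threads[i], worker, addr data[i])
joinThreads threads  # join before `data` goes out of scope
assert data == @[42, 42, 42, 42]
```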
I suppose a possible solution is to be able to define `=destroy`/`=`/`=move`
for a subtype of unowned ref (not necessary for owned refs because, whenever
they deal with refcounts on destruction, their thread is expected to be the
only access point) when one is going to use it this way, so that the default
is non-atomic refcounts and atomicity is added as required. Would this be
possible in your implementation? Can it be done simply? One idea would be to
have an accessor for the operations on the refcount that can be overridden,
something like the following:
# define byteCopy and byteSwap helpers
template byteCopy(dst, src: typed) =
  copyMem(dst.unsafeAddr, src.unsafeAddr, src.sizeof)

template byteSwap(dst, src: typed) =
  var tmp: typeof(src)
  copyMem(tmp.addr, src.unsafeAddr, src.sizeof)
  copyMem(src.unsafeAddr, dst.unsafeAddr, src.sizeof)
  copyMem(dst.unsafeAddr, tmp.addr, src.sizeof)

# with the actual underlying object that contains the refcount as follows
type
  RefObj[T] = object
    pntr: T
    refcount: Natural

proc inc[T](ro: var RefObj[T]) {.inline.} = ro.refcount.inc
proc dec[T](ro: var RefObj[T]) {.inline.} = ro.refcount.dec

# and the atomically refcounted version
type RefObjAtomic[T] = distinct RefObj[T]

proc inc[T](ro: var RefObjAtomic[T]) {.inline.} = atomicInc RefObj[T](ro).refcount
proc dec[T](ro: var RefObjAtomic[T]) {.inline.} = atomicDec RefObj[T](ro).refcount

type
  Ref[T] = ptr RefObj[T] | ptr RefObjAtomic[T]
  OwnedRef[T] = distinct Ref[T]

proc `[]`[T](x: Ref[T] | OwnedRef[T]): var T {.inline.} =
  cast[ptr T](x)[]  # pntr is the first field, so no double deref

# with the default unowned ref hooks effectively defined...
template `=destroy`(x: Ref) =  # no use of var, implying "cheating"
  if x != nil: x[].dec  # only changed to call through the hooked dec

template `=`[T](x: Ref[T]; y: Ref[T] | OwnedRef[T]) =
  # Note: no need to check for self-assignment here.
  if y != nil: y[].inc  # same, through the hook
  if x != nil: x[].dec  # same
  byteCopy(x, y)  # raw pointer copy

template `=move`[T](x, y: Ref[T]) =  # no var, implying "cheating"
  # Note: moves are the same as assignments, but only for Ref, not OwnedRef.
  `=`(x, y)

# no real change for the owned ref hooks...
template `=destroy`[T](x: OwnedRef[T]) =  # no var, implying "cheating"
  if x != nil:
    assert x.refcount == 0, "dangling unowned pointers exist!"
    `=destroy`(x[])
    x.unsafeAddr[] = nil

template `=`[T](x, y: OwnedRef[T]) {.error: "owned refs can only be moved".}

template `=move`[T](x, y: OwnedRef[T]) =  # no var, implying "cheating"
  if x != y:
    `=destroy`(x)
    byteSwap(x, y)  # raw pointer swap leaves the source as nil
Now we can control which behaviour we get with my proposed `{.atomic.}`
pragma.
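For concreteness, a hypothetical declaration-site sketch; the `{.atomic.}` pragma does not exist today and the type names are illustrative only:

```nim
type
  Node = ref object                   # default: plain, fast refcount bumps
    next: Node
  SharedNode {.atomic.} = ref object  # opt-in: atomic refcount bumps,
    next: SharedNode                  #   safe to share across threads
```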
To me, it seems simple to implement and use, yet it provides exactly the
tuning capability we need while defaulting to the fast but not-so-safe case,
which would likely be the more common one. What do you think?