> all allocations are shared by default.

@Araq, the above means that all refs are shared, which means all unowned 
("dangling") refs are shared too. Those unowned refs can be passed to several 
other threads, as long as the calling thread outlives the threads to which 
they are passed. In turn, this means that "refcount" accesses must be atomic 
when "threads" is on, as they can now come from multiple threads. This has a 
pretty big impact on performance for shared unowned refs.

I believe that passing unowned refs is a valid use case: the controlling 
thread, which will often be the main thread, needs to divide up work while 
retaining ownership of the container, which might contain things such as 
strings and/or seqs. For the strings/seqs themselves, atomic accessors are an 
option that can be enabled or bypassed as inter-thread use requires, but ref 
counting is part of the definition of refs.

I suppose a possible solution is to be able to define =destroy/=copy/=move 
for a subtype of unowned ref when one is going to use it this way, so the 
default can be not to use atomic refcounts but to add "atomicity" as required. 
(This is not necessary for owned refs because, whenever they deal with ref 
counts on destruction, their thread is expected to be the only access point.) 
Would this be possible in your implementation? Can it be done simply? One idea 
would be to have an accessor for the operations on refcount that can be 
overridden, something like the following:
    
    
    # define byteCopy and byteSwap
    template `byteCopy`(dst, src: typed) =
      copyMem(`dst`.unsafeAddr, src.unsafeAddr, `src`.sizeof)
    template `byteSwap`(dst, src: typed) =
      var tmp: `src`.type
      copyMem(tmp.unsafeAddr, `src`.unsafeAddr, `src`.sizeof)
      copyMem(`src`.unsafeAddr, `dst`.unsafeAddr, `src`.sizeof)
      copyMem(`dst`.unsafeAddr, tmp.unsafeAddr, `src`.sizeof)
    
    # with the actual underlying object that contains the refcount as follows
    type
      RefObj[T] = object
        pntr: T
        refcount: Natural
    proc `inc`[T](ro: var RefObj[T]) {.inline.} = ro.refcount.inc
    proc `dec`[T](ro: var RefObj[T]) {.inline.} = ro.refcount.dec
    
    # and the atomically ref counted version
    type RefObjAtomic[T] = distinct RefObj[T]
    # a distinct type hides its fields, so convert back to RefObj[T] first
    proc `inc`[T](ro: var RefObjAtomic[T]) {.inline.} = RefObj[T](ro).refcount.atomicInc
    proc `dec`[T](ro: var RefObjAtomic[T]) {.inline.} = RefObj[T](ro).refcount.atomicDec
    
    type
      Ref[T] = ptr RefObj[T] | ptr RefObjAtomic[T]
      OwnedRef[T] = distinct Ref[T]
    proc `[]`[T](x: Ref[T] | OwnedRef[T]): ptr T {.inline.} =
      cast[ptr T](x) # no double deref
    
    # with the default unowned ref hooks effectively defined...
    template `=destroy`(x: Ref) = #  no use of var implying "cheating"
      if `x` != nil: `x`.dec # only changed to call through hooked dec
    
    template `=`[T](x: Ref[T]; y: Ref[T] | OwnedRef[T]) =
      # Note: No need to check for self-assignments here.
      if `y` != nil: `y`.inc # same through hook
      if `x` != nil: `x`.dec # same
      `byteCopy`(`x`, `y`) # raw pointer copy
    
    template `=move`[T](x, y: Ref[T]) = # no var implying "cheating"
      # Note: Moves are the same as assignments but only for Ref not OwnedRef.
      `=`(`x`, `y`)
    
    # no real change for owned ref hooks...
    template `=destroy`[T](x: OwnedRef[T]) = # no var implying "cheating"
      if `x` != nil:
        assert `x`.refcount == 0, "dangling unowned pointers exist!"
        `=destroy`(`x`[])
        `x`.unsafeAddr[] = nil
    
    template `=`[T](x, y: OwnedRef[T]) {.error: "owned refs can only be moved".}
    
    template `=move`[T](x, y: OwnedRef[T]) = # no var implying "cheating"
      if `x` != `y`:
        `=destroy`(`x`)
        `byteSwap`(`x`, `y`) # raw pointer swap leaves source as nil
    
    

Now we can control which behaviour we get with my proposed {.atomic.} 
pragma.
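
As a rough, runnable illustration of that choice: the {.atomic.} pragma is 
only a proposal and does not exist, so the sketch below stands in for it with 
two made-up types, `PlainCount` and `AtomicCount`, and made-up `incRef`/`decRef` 
procs. Only `atomicInc` and `atomicDec` are real (they come from the system 
module). It is meant to show the shape of the idea, not the real implementation.

```nim
# Sketch only: PlainCount, AtomicCount, incRef and decRef are hypothetical
# names used for illustration; they are not part of any Nim implementation.
type
  PlainCount = object     # default case: fast, single-thread refcount
    refcount: int
  AtomicCount = object    # opt-in case: what {.atomic.} might select
    refcount: int

# Same call syntax for both; overload resolution picks the policy by type.
proc incRef(c: var PlainCount) {.inline.} = inc c.refcount
proc decRef(c: var PlainCount) {.inline.} = dec c.refcount
proc incRef(c: var AtomicCount) {.inline.} = discard atomicInc(c.refcount)
proc decRef(c: var AtomicCount) {.inline.} = discard atomicDec(c.refcount)

var a: PlainCount
var b: AtomicCount
incRef a; incRef a; decRef a
incRef b; incRef b; decRef b
doAssert a.refcount == 1
doAssert b.refcount == 1
```

The calling code is identical in both cases, so only refs that are actually 
shared across threads would pay for the atomic operations.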

To me, it seems simple to implement and use, yet it provides exactly the 
tuning capabilities we need while defaulting to the fast but not-so-safe case, 
which would likely be the more commonly used one. What do you think?
