Possible new COW/copy suggestions?

Era Scarecrow Sat, 21 Aug 2010 12:20:46 -0700

   I was reading Andrei Alexandrescu's book on D, and it occurred to me that 
there could be a couple of special-case copy methods for copy-on-write (COW) 
that work on arrays only. (On single variables they would do nothing special, 
since a change just replaces the variable's contents.) I also have a copying 
suggestion for structures.


  You _can_ live without these, but they would make certain tasks and cases a 
lot less repetitive and error-prone.


  For arrays using COW, I'm using DMD's toupper function as a reference for how 
this would work/affect code. http://www.digitalmars.com/d/2.0/memory.html

--Strings (and Array) Copy-on-Write

char[] toupper(char[] s)
{
    int i;

    for (i = 0; i < s.length; i++)
    {
        char c = s[i];
        if ('a' <= c && c <= 'z')
            s[i] = c - (cast(char)'a' - 'A');
    }
    return s;
}

  The later example Walter gives there would definitely work, but what if the 
compiler did most of the work for us? Say, by adding a keyword like cowref? Then 
the only visible change would be in the function signature.

char[] toupper(cowref char[] s)

  Internally the compiler would add a flag. Just before the function first 
changes the array, it would check the flag and, if no copy has been made yet, 
make a duplicate. With this in mind, the parameter can be treated as const/in by 
calling functions and thought of as inout inside the function, which allows it 
to accept const/immutable data. This could be a permanent change in how arrays 
work, or maybe a subtype of array for these specific calls.

char[] toupper(cowref char[] s)
{
    bool __cow_s = true;
    int i;

    for (i = 0; i < s.length; i++)
    {
        char c = s[i];
        if ('a' <= c && c <= 'z') {
            if (__cow_s) {
                s = s.dup; /*make copy*/
                __cow_s = false;
            }
            s[i] = c - (cast(char)'a' - 'A');
        }
    }
    return s;
}
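
  For comparison, the same effect can be written by hand today without any new 
keyword. This is only a sketch: the name toUpperCow and the choice to return 
const(char)[] (so an untouched input can be handed back as-is) are assumptions 
made for illustration, not part of the proposal.

```d
// Hand-written copy-on-write; toUpperCow is a hypothetical name.
const(char)[] toUpperCow(const(char)[] s)
{
    char[] copy = null; // stays null until the first write

    foreach (i, c; s)
    {
        if ('a' <= c && c <= 'z')
        {
            if (copy is null)
                copy = s.dup; // duplicate once, on the first change
            copy[i] = cast(char)(c - ('a' - 'A'));
        }
    }
    // Nothing changed: return the caller's array untouched.
    return copy is null ? s : copy;
}

unittest
{
    string shout = "ALREADY UPPER";
    assert(toUpperCow(shout).ptr is shout.ptr); // no copy was made
    assert(toUpperCow("mixed Case") == "MIXED CASE");
}
```

  The null check on every write is exactly the per-pass cost that a cowref 
keyword would have to hide or optimize away.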

 As an optimization when only one cowref is involved, the compiler may end up 
emitting two copies of the function body with an additional label/goto: the 
first time the code would modify the array, it copies the array and then 
branches into the second copy, so the check isn't done on every pass. Ex:


char[] toupper(cowref char[] s)
{
    int i;

    for (i = 0; i < s.length; i++)
    {
        char c = s[i];
        if ('a' <= c && c <= 'z') {
            /*changes made in this scope, everything but the array copying
              is removed. */
            goto __cowref_jump;
        }
    }
    return s;

    /*only copies code it can possibly return to, in a loop or goto jumps*/
    for (; i < s.length; i++)
    {
        char c = s[i];
        if ('a' <= c && c <= 'z') {
/*continue point at start of scope*/
__cowref_jump:
            s[i] = c - (cast(char)'a' - 'A');
        }
    }
    return s;
}
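
  The two-loop idea can also be written without goto. The sketch below is one 
possible hand-written equivalent, not what a compiler would necessarily emit; 
toUpperTwoPhase is a made-up name, and the const(char)[] return type is an 
assumption so the unmodified input can be returned directly.

```d
// Phase 1 only reads; phase 2 runs on a private copy with no flag checks.
// toUpperTwoPhase is a hypothetical name used for illustration.
const(char)[] toUpperTwoPhase(const(char)[] s)
{
    size_t i = 0;

    // Phase 1: skip ahead to the first character that needs changing.
    while (i < s.length && !('a' <= s[i] && s[i] <= 'z'))
        ++i;

    if (i == s.length)
        return s; // nothing to change, so no copy is made

    // Phase 2: copy once, then continue from the same index.
    char[] copy = s.dup;
    for (; i < copy.length; ++i)
    {
        immutable c = copy[i];
        if ('a' <= c && c <= 'z')
            copy[i] = cast(char)(c - ('a' - 'A'));
    }
    return copy;
}

unittest
{
    string upper = "NO CHANGE";
    assert(toUpperTwoPhase(upper).ptr is upper.ptr);
    assert(toUpperTwoPhase("ABc dE") == "ABC DE");
}
```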

  My second thought is for when you want to keep referring to the original 
array but copy forward only the specific elements that change (rather than the 
whole array). This would be especially useful when referencing sectors of 512 
bytes or larger as individual blocks. Perhaps cowarray would be the keyword. The 
array would work normally, with only a couple of extra lookups. This would also 
accept const/immutable data.

char[] toupper(cowarray char[] s)
{
//if known it's returning the array, it might precopy the original.
//but if it does that, the bool change array is probably unneeded unless
//you need to know if specific parts of the array were changed. Which
//means it may just become a cowref instead of a cowarray.
//bool still needed for multi-dimensional arrays.
    bool[] __cowarr_change_s = new bool[s.length];
    char[] __cowarr_arr_s = new char[s.length];

    int i;

    for (i = 0; i < s.length; i++)
    {
//if changed, use change
//If the compiler sees it will never go over this again, it may
//skip this check and just read.
        char c = __cowarr_change_s[i] ? __cowarr_arr_s[i] : s[i];
//precopy
//      char c = __cowarr_arr_s[i];

        if ('a' <= c && c <= 'z') {
//change and ensure it's changed on the flag.
            __cowarr_arr_s[i] = c - (cast(char)'a' - 'A');
            __cowarr_change_s[i] = true;
        }
    }

   /*when copying out or to another array or duplicating, the current view
     is used without the cow part active.*/
    return s;
}

  If you needed to know whether a given block had changed, perhaps .changed 
could be used, and the compiler would return true/false.
  if(s[i].changed) { /*code*/ }
//becomes
  if(__cowarr_change_s[i]) { /*code*/ }
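
  As a rough library-level approximation of the cowarray idea, a wrapper struct 
can keep the original array read-only and store changed elements in an overlay. 
Everything here is hypothetical: CowView, changed(), and flatten() are names 
invented for this sketch, and T is assumed to be a plain value type such as 
char or ubyte.

```d
// Per-element copy-on-write view over a read-only array (illustrative only).
struct CowView(T)
{
    private const(T)[] original; // never written to
    private T[] overlay;         // holds changed elements only
    private bool[] changedFlags; // true where overlay is valid

    this(const(T)[] src)
    {
        original = src;
        overlay = new T[src.length];
        changedFlags = new bool[src.length];
    }

    @property size_t length() const { return original.length; }

    // Read: prefer the changed value, fall back to the original.
    T opIndex(size_t i) const
    {
        return changedFlags[i] ? overlay[i] : original[i];
    }

    // Write: store in the overlay and record the change.
    void opIndexAssign(T value, size_t i)
    {
        overlay[i] = value;
        changedFlags[i] = true;
    }

    bool changed(size_t i) const { return changedFlags[i]; }

    // Copy out the current view as an ordinary array.
    T[] flatten() const
    {
        auto result = original.dup;
        foreach (i, flag; changedFlags)
            if (flag)
                result[i] = overlay[i];
        return result;
    }
}

unittest
{
    auto view = CowView!char("abC");
    foreach (i; 0 .. view.length)
    {
        immutable c = view[i];
        if ('a' <= c && c <= 'z')
            view[i] = cast(char)(c - ('a' - 'A'));
    }
    assert(view.changed(0) && view.changed(1) && !view.changed(2));
    assert(view.flatten() == "ABC");
}
```

  This version allocates a full-size overlay up front for simplicity; a sparse 
overlay keyed by index or block would fit the sector use case better.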

  Finally, the last suggestion involves structure copying. Copying a structure 
does a bitwise copy; however, when the structure holds references to 
arrays/structures/classes, you may want to make duplicates rather than refer to 
the originals.

//Book example, pg 246
struct Widget {
   private int[] array;
   this(uint length) {
      array = new int[length];
   }
   // Postblit constructor
   this(this) {
      array = array.dup;
   }
   /*other code*/
}

  Perhaps a keyword like oncopy (the copy function defaults to dup) or 
onstructcopy (same idea) could be used. The compiler would gather all the oncopy 
fields and build a default this(this) from them. If you need anything more 
complicated or extra during the copy, your own definition of this(this) would be 
appended to the compiler-generated one and execute after it. Ex:

struct Widget {
//   private oncopy(dup) int[] array; 
//       Name of function (is/could be) optional if the function dup
//       is used to create a copy. might be used as oncopy!(dup)
   private oncopy int[] array;

     this(this) {
         //compiler generated oncopy's
         //dup is the copy name, which could be clone or something else.
         array = array.dup;

         // User definition (if any) Appended here.
     }

   /*other code*/
}
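
  Something close to oncopy can be approximated in library code, assuming a 
compiler new enough to support user-defined attributes (a feature added to D 
after this post was written). In the sketch below the oncopy marker struct and 
the dupOncopyFields helper are both hypothetical names, and the user still has 
to write the one-line this(this) that calls the helper.

```d
import std.traits : hasUDA;

// Hypothetical marker attribute standing in for the proposed keyword.
struct oncopy {}

// Duplicates every field of `self` that is tagged with @oncopy.
void dupOncopyFields(T)(ref T self)
{
    foreach (i, ref field; self.tupleof)
    {
        static if (hasUDA!(T.tupleof[i], oncopy))
            field = field.dup;
    }
}

struct Widget
{
    @oncopy private int[] array;

    this(uint length)
    {
        array = new int[length];
    }

    // Postblit: the part the proposal would have the compiler generate.
    this(this)
    {
        dupOncopyFields(this);
        // User definition (if any) would follow here.
    }

    int get(size_t i) const { return array[i]; }
    void set(size_t i, int value) { array[i] = value; }
}

unittest
{
    auto a = Widget(3);
    a.set(0, 1);
    auto b = a;   // postblit runs, so b gets its own array
    b.set(0, 42);
    assert(a.get(0) == 1);
    assert(b.get(0) == 42);
}
```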

  Naturally, immutable data doesn't need to be copied since it doesn't change. 
If it does need to change during the copy, the user would likely end up doing 
that manually, so using oncopy on immutable data would be an error.

 Comments and suggestions? I'd like to hear Walter's feedback and opinions on 
these.

 Era
