Re: [phobos] I think we need to make an emergency release

Andrei Alexandrescu Wed, 25 Aug 2010 14:36:37 -0700

I needed Appender to be a reference type for std.format and std.stdio.The point is to be able to compose a complex stream read into a sequenceof simpler reads. This is possible if you use one Appender throughout,but I don't see why preclude people from appending output to their ownstrings. It's not a difficult to implement feature (my convoluted codenotwithstanding), and it's useful. Why decide to not provide it?


Andrei


On 8/25/10 14:13 PDT, Steve Schveighoffer wrote:





----- Original Message ----

From: Andrei Alexandrescu<[email protected]>
To: Discuss the phobos library for D<[email protected]>
Sent: Wed, August 25, 2010 4:47:35 PM
Subject: Re: [phobos] I think we need to make an emergency release

On 8/25/10 12:48 PDT, Steve Schveighoffer wrote:

One very very notable  difference from the original appender is it does not

take

a pointer as  the parameter.  This means it does not update the original

array

  you pass to it (well, it does, but will not modify the length).


I don't  think this is good. When you format stuff you do want to append
the  formatted result incrementally to the string.


All it does, and I've made these changes in a couple places, is change this:

auto app = appender(&buffer);
...
return buffer;

to this:

auto app = appender(buffer);
...
return app.data;

You say it's needed, but in reality I didn't find a single compelling use case
in phobos, and Appender is used all over phobos.  Most cases looked like this:

string buffer;
auto app = appender(&buffer);

and no reference to buffer later.  Those were trivially replaced with this:

auto app = appender!string();

There was one case in stdio.d where appender is used to convert a utf-16 file to
utf-8, but appender already has code to take care of that (not added by me).
The new code looks less confusing to me, you be the judge:

[ste...@steveslaptop phobos]$ svn diff std/stdio.d
Index: std/stdio.d
===================================================================
--- std/stdio.d    (revision 1925)
+++ std/stdio.d    (working copy)
@@ -2099,8 +2099,8 @@
           * Read them and convert to chars.
           */
          static assert(wchar_t.sizeof == 2);
-        auto app = appender(&buf);
-        buf.length = 0;
+        auto app = appender(buf);
+        app.clear();
          for (int c = void; (c = FGETWC(fp)) != -1; )
          {
              if ((c&  ~0x7F) == 0)
@@ -2120,11 +2120,13 @@
                      }
                      c = ((c - 0xD7C0)<<  10) + (c2 - 0xDC00);
                  }
-                std.utf.encode(buf, c);
+                //std.utf.encode(buf, c);
+                app.put(cast(dchar)c);
              }
          }
          if (ferror(fps))
              StdioException();
+        buf = app.data;
          return buf.length;
      }

@@ -2138,20 +2140,16 @@
           * cases.
           */
        L1:
-        if(buf.ptr is null)
-        {
-            sz = 128;
-            auto p = cast(char*) GC.malloc(sz, GC.BlkAttr.NO_SCAN);
-            buf = p[0 .. 0];
-        } else {
-            buf.length = 0;
-        }
+        auto app = appender(buf);
+        app.clear();
+        if(app.capacity == 0)
+            app.reserve(128); // get at least 128 bytes available

-        auto app = appender(&buf);
          int c;
          while((c = FGETC(fp)) != -1) {
              app.put(cast(char) c);
-            if(buf[$ - 1] == terminator) {
+            if(c == terminator) {
+                buf = app.data;
                  return buf.length;
              }

===========================

I also suspect that the way this function's return value is typed is based on
the 'take-an-array-pointer' mode of Appender.  I think it'd be better typed this
way:

private char[] readlnImpl(FILE* fps, char[] buf = null, dchar terminator = '\n')

That would get rid of the "set buf before returning" part of the code, you just
return app.data.

There were several cases where someone did something like this:

buffer.length = 0;
auto app = appender(&buffer);

or

auto app = appender(&buffer);
buffer.length = 0;

This is trivially replaced with this:

auto app = appender(buffer);
app.clear();

This tells appender, you can own and replace the data in buffer.  It will not
append outside of buffer, rather it will realloc as necessary.

Other than that, here  is the updated API (with private data shown for an

idea of

how the  implementation works):

struct Appender(A : T[], T)
  {
      private struct Data
       {
          size_t capacity;
           Unqual!(T)[] arr;
       }

      private Data* _data;


I think this  looks good - why not make arr of type pointer to array and
support append to  existing arrays?


Safety concerns.  You are most likely passing a reference to a stack-based array
reference, so if you escape the Appender, you escape the stack-based reference.
This isn't strictly necessary because the most important thing is the data.
Now, you could escape stack data by passing a reference to a stack-allocated
buffer, but I think there's no way around this.

The thing is, when requiring passing a pointer, you are *guaranteeing* the
chance for escaping.  Even when you don't care about the original reference.
The fact that about half of phobos usages of appender looked like this:

string buffer;
auto app = appender(&buffer); // appender contains a pointer to stack data now!

tells me that people will make this mistake.

/**
Construct an appender with a given  array.  Note that this does not copy the
data, but appending to the  array will copy the data, since Appender does

not

know if this data has  valid data residing after it.  The initial capacity

will

be
  arr.length.
*/
      this(T[]  arr);


Appender was partly created to avoid the issue the comment warns  about.


All it means is, don't expect what you do to the appender to be reflected in the
original array reference.  I think designing to support this is a huge safety
problem, and provides almost no useful benefit.  If you have the appender, you
can get the appender's data,  why do you have to use the original reference?.
Also note any aliases to the original array reference won't be updated.

-Steve




_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos

_______________________________________________
phobos mailing list
[email protected]
http://lists.puremagic.com/mailman/listinfo/phobos

Re: [phobos] I think we need to make an emergency release

Reply via email to