Re: redefining "put" and "OutputRange"

monarch_dodra Fri, 30 Aug 2013 08:16:04 -0700

On Friday, 30 August 2013 at 14:03:36 UTC, Dmitry Olshansky wrote:

30-Aug-2013 14:53, monarch_dodra пишет:
I'm starting this thread, first, as a shamless plug to a brandnew pullI just did, which (amongst others) allows "put" to transcodeon the flyany string/char of any length, to any string/char of anylength.
This fixes issues mostly in std.format, and also allows thingslikeformattedWrite to work with pure delegates (it didn't before),as wellas output directly in wstring or dstring format (awesome forwriting to
a UTF-16 file, for example).
Which is all good and well, but seeing this:

static
if (is (R == T function(const( char)[]), T) || is (R == Tdelegate(const( char)[]), T))
        enum isSink = 1;
else static if (is (R == T function(const(wchar)[]), T) ||is (R == T delegate(const(wchar)[]), T))
        enum isSink = 2;
else static if (is (R == T function(const(dchar)[]), T) ||is (R == T delegate(const(dchar)[]), T))
        enum isSink = 4;
    else
        enum isSink = 0;
Doesn't inspire confidence - it's special casing on (w|d)chararrays, again... Let's hopefully stop spreading this plaguethroughout especially under banners of generality.

It's a special case for sinks, yes. I'm not a fan of this, but Ithink it is the *single* cases we can trust. (More on this bellow)

--------
The real reason I'm starting this thread is I believe thecurrent way"put" leads to a *MASSIVE*, *HORRIFYING* issue. I dare not sayit:
Escaping references to local stack variables (!!!).
It is a dangerous primitive. It's not a good idea to wrapeverything with safe bags and specialize a single case - arraysand not even appender of (w|d)char[].
Instead it's once again a case where primitive needs betterhigh-level contract inexpressible in simply terms such as@safe-ty provides.
The rule is: OutputRange must not hold references to any slicesgiven.
And is trivially true for many of current ranges.

OutputRange really just means that put(r, e) resolves one way oranother. And it also fundamentally depends on what you considerthe "element type".

For example, int[][] is an output range for the element int[]. Itmakes a copy of said element (int[]), but it certainly *won't*copy the contents of that slice.

Basically, if R accepts an "E[]", than put will accept asingle Eelement as input, and convert it to an "E[]" on the fly, using"put(r,(&e)[0 .. 1]);". I'm sure you can see the problem. It allowsthings such
as:

//----
void main()
{
    Appender!(int[][]) app; //A type that accumulates slices
    put(app, 1);
    put(app, 2);
    put(app, 3);
    writeln(app.data); //prints: [[3], [3], [3]]
}
//----
Bad code regardless. The bug is in Appender of slices aliasingthem instead of copying the data thus breaking the above rule.BTW it will break in similar fashion with any alias-able typeI've no idea how one would help it.

The bug most certainly isn't in Appender. Appender's job is toaccumulate *slices*, and is exactly what it is doing. The callercode might be incorrect, but std.range.put *is* accepting it, andis doing a terrible job at it.

I'd like to make a proposition: "put" needs to be changed to*not*accept putting an E into something that accepts E[]. There issimply *noway* to do this safely, and without allocating (both of which,IMO, are
non-negotiable).
Just relax and step back for a moment. The bug in question ispainfully easy to blowup so chances for it being HORRIBLE arequite low (it's a loud bug). Safety is cool but I expect thatoutput ranges are designed with idea of copying somethingsomewhere or absorbing or accumulating.

I'd agree, if output ranges were actually "designed". Right now,the basic definition is that an "OutputRange" collects"Elements". "put" extends the supported "Elements".

The truth is that format sinks "(const(char)[]){}" is the *only*OutputRange that collects "Elements", but whose' signture is onethat accepts a slice. This "flaws" the slice/element notion.

If format sinks were defined as "(char){}" to begin with, theneverything would work fine (and *does*), but this is not the casetoday, and that is the *only* reason I made an exception for them.

For objects that define put/opCall, then it is not verycomplicated tohave two different signatures for "put(E[])"/"opCall(E[])"*and*"put(E)"/"opCall(E)". This makes it explicit what is and isn'taccepted.
And that will subtly break some genuinely fine code...


It would "explicitly" break code that may (or may *not*) be fine.

Lucky enough, the problem never existed with input ranges:"int[][]"
never accepted "int", so there is no problem there.
This is it - a confusion between output range of int[]'saccepting them one by one and of int and accepting them inchunks.

I think the problem is "put" overstepping its boundaries. If"r.put(someSlice)" compiles, "put" has no reason to think that Ractually owns the elements in the slice.

Re: redefining "put" and "OutputRange"

Reply via email to